Advanced Data Movement
Overview
Data movement is essential to tileflow programs. In this section, you will learn more advanced DMA statement syntax that supports a wider range of scenarios.
Advances in Data Expression
Reshape with .span_as
Dimension Composition inside chunkat
Full Non-Blocking DMA Mode (Chain Mode in Choreo)
In Choreo, you can perform a fully non-blocking DMA by chaining multiple asynchronous DMA operations and using event-based notifications with after. Each DMA operation in the chain is triggered only after a prior one completes, so the main program flow never has to wait. Here’s an example of such a setup:
out_store = dma.copy.async l2_out => output.chunkat(m_tile, n_tile) after out_store_s;
In this example:
- out_store is the asynchronous DMA operation that transfers data from l2_out to output.chunkat(m_tile, n_tile).
- out_store_s is another DMA operation or event that must complete before out_store can proceed.
- The after out_store_s syntax specifies that out_store should only start after the completion of out_store_s, which ensures that there is no blocking in the main thread.
In this case, neither out_store nor out_store_s blocks the main program flow. The program continues executing while these DMA operations are handled in the background. The key point is that the completion of out_store_s triggers the start of out_store, creating an event-driven dependency between the two DMA operations. This lets transfers be ordered relative to each other without stalling the main flow.
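The event-driven dependency described above can be modeled outside Choreo. The following is a minimal Python sketch using asyncio, offered only as an analogy: it is not Choreo syntax and does not reflect how Choreo implements DMA. The dma_copy helper, its parameters, and the simulated delay are hypothetical names introduced for illustration; only out_store and out_store_s come from the example above.

```python
import asyncio

async def dma_copy(name, log, after=None, done=None, delay=0.01):
    """Hypothetical stand-in for an asynchronous DMA transfer (not a Choreo API)."""
    if after is not None:
        await after.wait()          # chained: start only once the prior transfer signals
    log.append(f"{name} start")
    await asyncio.sleep(delay)      # simulated transfer time (assumption)
    log.append(f"{name} done")
    if done is not None:
        done.set()                  # notify any transfer chained after this one

async def run_chain():
    log = []
    out_store_s_done = asyncio.Event()
    # Both transfers are launched up front; neither call blocks the main flow.
    first = asyncio.create_task(dma_copy("out_store_s", log, done=out_store_s_done))
    second = asyncio.create_task(dma_copy("out_store", log, after=out_store_s_done))
    log.append("main continues")    # main flow proceeds before either transfer runs
    await asyncio.gather(first, second)
    return log

log = asyncio.run(run_chain())
print(log)
```

In this model, the main flow records its entry first, and out_store begins only after out_store_s has signaled completion, which mirrors the role of after out_store_s in the Choreo statement.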