Framework to develop FPGA applications in C++ with the ease of PYNQ
Introduction
CYNQ is a C++ framework for implementing FPGA-accelerated applications with the same ease of use as the PYNQ framework for Python. It lets users build applications that perform better than their Python equivalents while avoiding the lengthy process of coding applications with Vitis.
Dependencies
- Meson >= 1.x
- Python >= 3.8
- GCC >= 9.x
- XRT >= 2.13
- Linux FPGA Manager
What does CYNQ look like?
CYNQ is quite similar to PYNQ. Let's have a look.
PYNQ:
import numpy as np
from pynq import allocate, Overlay
# Configure the FPGA
design = Overlay("design.bit")
# Extract the accelerator (IP core) and the DMA
dma = design.axi_dma_0
accel = design.multiplication_accel_0
# Allocate buffers (input_elements and output_elements are defined elsewhere)
inbuf = allocate(shape=(input_elements,), dtype=np.uint16)
outbuf = allocate(shape=(output_elements,), dtype=np.uint16)
# Run
dma.sendchannel.transfer(inbuf)
accel.write(accel.register_map.CTRL.address, 0x81)
accel.write(accel.register_map.n_elements.address, input_elements)
dma.recvchannel.transfer(outbuf)
dma.recvchannel.wait()
# Dispose of the buffers
del inbuf
del outbuf
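With CYNQ for Xilinx UltraScale+:
#include <cynq/cynq.hpp>
using namespace cynq;
// Configure the FPGA. IHardware::Create is the factory for the hardware-specific
// accelerator and data mover classes; bitstream and xclbin are std::string
// paths to the design files, defined elsewhere.
auto kArch = HardwareArchitecture::UltraScale;
auto platform = IHardware::Create(kArch, bitstream, xclbin);
// Get the accelerator (IP core) and the DMA; the addresses come from the design
const uint64_t accel_addr = 0xa000000;
const uint64_t dma_addr = 0xa0010000;
auto accel = platform->GetAccelerator(accel_addr);
auto mover = platform->GetDataMover(dma_addr);
// Allocate the buffers and get their host pointers
// (input_size and output_size are defined elsewhere)
auto inbuf = mover->GetBuffer(input_size, accel->GetMemoryBank(0));
auto outbuf = mover->GetBuffer(output_size, accel->GetMemoryBank(1));
uint16_t* input_ptr = inbuf->HostAddress<uint16_t>().get();
uint16_t* output_ptr = outbuf->HostAddress<uint16_t>().get();
// Attach the accelerator parameter to its register
uint32_t num_elements = 4096;
const uint64_t addr_num_elements = 0x20;
accel->Attach(addr_num_elements, &num_elements, RegisterAccess::RO);
// Run: upload the input, start the accelerator, download the output
mover->Upload(inbuf, inbuf->Size(), 0, ExecutionType::Async);
accel->Start(StartMode::Continuous);
mover->Download(outbuf, outbuf->Size(), 0, ExecutionType::Sync);
accel->Stop();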
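CYNQ's API documentation describes `IHardware::Create` as a factory method that returns hardware-specific subclasses. The sketch below illustrates that pattern in isolation so it compiles without CYNQ or an FPGA. Every name in it (`DemoArch`, `DemoHardware`, `DemoUltraScale`, `DemoAlveo`, `Describe`) is a hypothetical stand-in rather than part of CYNQ, and the choice of which file each subclass consumes is an assumption made only for the demo.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; these are not CYNQ classes.
enum class DemoArch { UltraScale, Alveo };

class DemoHardware {
 public:
  virtual ~DemoHardware() = default;
  virtual std::string Describe() const = 0;

  // Factory method: callers name an architecture and receive the matching
  // subclass behind the common interface.
  static std::shared_ptr<DemoHardware> Create(DemoArch arch,
                                              const std::string &bitstream,
                                              const std::string &xclbin);
};

class DemoUltraScale : public DemoHardware {
 public:
  explicit DemoUltraScale(std::string bitstream)
      : bitstream_(std::move(bitstream)) {}
  std::string Describe() const override { return "UltraScale:" + bitstream_; }

 private:
  std::string bitstream_;
};

class DemoAlveo : public DemoHardware {
 public:
  explicit DemoAlveo(std::string xclbin) : xclbin_(std::move(xclbin)) {}
  std::string Describe() const override { return "Alveo:" + xclbin_; }

 private:
  std::string xclbin_;
};

std::shared_ptr<DemoHardware> DemoHardware::Create(DemoArch arch,
                                                   const std::string &bitstream,
                                                   const std::string &xclbin) {
  switch (arch) {
    case DemoArch::UltraScale:
      return std::make_shared<DemoUltraScale>(bitstream);
    case DemoArch::Alveo:
      return std::make_shared<DemoAlveo>(xclbin);
  }
  throw std::invalid_argument("unsupported architecture");
}
```

The point of the pattern is that application code written against the base interface stays the same when the target changes; only the architecture tag passed to `Create` differs, which is why the two CYNQ listings look so alike.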
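With CYNQ for Alveo:
#include <cynq/cynq.hpp>
using namespace cynq;
// Configure the FPGA; bitstream and xclbin are std::string paths to the
// design files, defined elsewhere
auto kArch = HardwareArchitecture::Alveo;
auto platform = IHardware::Create(kArch, bitstream, xclbin);
// Get the accelerator (kernel) by name and the data mover
auto accel = platform->GetAccelerator("vadd");
auto mover = platform->GetDataMover(0);
// Allocate the buffers and get their host pointers
// (input_size and output_size are defined elsewhere)
auto inbuf = mover->GetBuffer(input_size, accel->GetMemoryBank(0));
auto outbuf = mover->GetBuffer(output_size, accel->GetMemoryBank(1));
uint16_t* input_ptr = inbuf->HostAddress<uint16_t>().get();
uint16_t* output_ptr = outbuf->HostAddress<uint16_t>().get();
// Attach the kernel arguments by index
const uint32_t num_elements = 4096;
accel->Attach(0, inbuf);
accel->Attach(1, outbuf);
accel->Attach(2, &num_elements, RegisterAccess::RO);
// Run: upload the input, start the kernel once, download the output
mover->Upload(inbuf, inbuf->Size(), 0, ExecutionType::Async);
accel->Start(StartMode::Once);
mover->Download(outbuf, outbuf->Size(), 0, ExecutionType::Sync);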
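Both CYNQ listings obtain raw host pointers (`input_ptr`, `output_ptr`) without showing how they are used. A minimal sketch of the usual flow follows: fill the input through the pointer before the upload, then compare the output against a CPU reference after the download. It works on plain pointers, so it compiles without CYNQ or hardware. The helper names `BytesFor`, `FillRamp`, and `CountMismatches` are hypothetical and not part of CYNQ. The idea that buffer sizes are expressed in bytes is an assumption to check against the CYNQ API reference, and the reference function must match whatever your accelerator actually computes.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Bytes needed for n elements of type T (assumes buffer sizes are in bytes).
template <typename T>
constexpr std::size_t BytesFor(std::size_t n) {
  return n * sizeof(T);
}

// Write 0, 1, 2, ... into the host side of the input buffer.
inline void FillRamp(std::uint16_t *ptr, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    ptr[i] = static_cast<std::uint16_t>(i);
  }
}

// Count the elements where the device output differs from a CPU reference
// applied element-wise to the input.
inline std::size_t CountMismatches(
    const std::uint16_t *in, const std::uint16_t *out, std::size_t n,
    const std::function<std::uint16_t(std::uint16_t)> &reference) {
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (out[i] != reference(in[i])) {
      ++mismatches;
    }
  }
  return mismatches;
}
```

In the listings above, `FillRamp(input_ptr, num_elements)` would go before `Upload`, and `CountMismatches(input_ptr, output_ptr, num_elements, reference)` after `Download`, with `reference` supplied by you.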
Currently tested
So far, we have tested CYNQ on:
- Xilinx KV26-based platforms with Ubuntu 22.04
- Xilinx Alveo U250 with shell xilinx_u250_gen3x16_xdma_4_1_202210_1 (other similar Alveo cards should also be compatible)
Cite Us:
@misc{cynq,
author = {León-Vega, Luis G. and Ávila-Torres, Diego and Castro-Godínez, Jorge},
title = {{CYNQ (v0.3)}},
year = {2024},
url = {https://github.com/ECASLab/cynq},
}