‘AIntuition’: Retrieval Augmented Generation (RAG) for Public Services and Administration Tasks by ITU

$1 500 USD
Challenge completed over 1 year ago
Generative AI
206 joined
21 active
Start
May 16, 24
Close
May 17, 24
Reveal
May 17, 24
AdeptSchneider22
Kenyatta University
Will the PDF and DOCX documents be structured or unstructured documents?
Data · 3 May 2024, 13:45 · 1

We're supposed to build our retrieval system on publicly available documents while we wait for the competition documents, which will be released 2 days before the deadline. Will those documents be structured or unstructured? Should we build an ingestion pipeline that can handle both?

Discussion 1 answer

Most of the documents will be unstructured or semi-structured. Ideally, the ingestion pipeline should handle a variety of public sector documents, including normative documents (e.g. recommendations and guidelines), legislative documents (e.g. laws, treaties, and policies), knowledge products (e.g. reports, handbooks), and strategic documents (e.g. plans, strategies).
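
For anyone wondering what "deal with either" could look like in practice, here is a minimal sketch of a chunker that works on text already extracted from a PDF or DOCX file. It splits on headings when it finds any (the semi-structured case) and falls back to overlapping word windows otherwise (the unstructured case). The heading pattern, the `Chunk` class, the function names, and the default sizes are all assumptions made for illustration; they are not part of the challenge or of any official starter code, and the heuristic will misfire on some real documents.

```python
import re
from dataclasses import dataclass

# Assumed heading heuristic (not from the challenge): short lines such as
# "1. Scope", "2.3 Definitions", "Article 5 ...", or "Section 2 ...".
HEADING_RE = re.compile(
    r"(?:\d+(?:\.\d+)*\.?\s+\S.*|(?:Article|Section|Chapter|Annex)\s+\w+.*)"
)
MAX_HEADING_LEN = 80  # assumption: longer lines are body text, not headings


@dataclass
class Chunk:
    heading: str  # empty string when no heading was detected
    text: str


def split_by_headings(text: str) -> list[Chunk]:
    """Group body lines under the most recent heading-like line."""
    chunks, heading, body = [], "", []
    for line in text.splitlines():
        stripped = line.strip()
        is_heading = (
            len(stripped) <= MAX_HEADING_LEN
            and HEADING_RE.fullmatch(stripped) is not None
        )
        if is_heading:
            if body:  # close the previous section; empty sections are dropped
                chunks.append(Chunk(heading, " ".join(body)))
            heading, body = stripped, []
        elif stripped:
            body.append(stripped)
    if body:
        chunks.append(Chunk(heading, " ".join(body)))
    return chunks


def split_by_window(text: str, size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Fallback for unstructured text: overlapping windows of `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(Chunk("", " ".join(words[start:start + size])))
        if start + size >= len(words):  # the last window reached the end
            break
        start += size - overlap
    return chunks


def chunk_document(text: str, size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Use heading-based sections if any heading is found, else word windows."""
    sections = split_by_headings(text)
    if any(chunk.heading for chunk in sections):
        return sections
    return split_by_window(text, size, overlap)
```

Calling `chunk_document(extracted_text)` on each file would then give a uniform list of chunks to embed, whichever kind of document it came from. Text extraction itself (for example with a PDF or DOCX parsing library) is left out here and would sit in front of this step.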

3 May 2024, 21:08
Upvotes 1