Build the data intelligence backbone of Kautilya.
Turn unstructured chaos into searchable, structured knowledge powering AI pipelines at scale.
Impact
Every dataset you process makes democracies smarter. Your pipelines feed LLMs and determine how truth is quantified, indexed, and retrieved.
What you'll do
- Design data pipelines for ingestion, transformation, and retrieval
- Build ETL systems that clean and deduplicate at scale
- Implement vector search, embeddings, and RAG pipelines
- Manage indexing, caching, and semantic retrieval
- Work with AI teams on data architecture and freshness
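To make the retrieval side of these responsibilities concrete, here is a minimal sketch of an ingest-then-search loop. Everything in it is illustrative: `MiniIndex`, the bag-of-words `embed` stand-in, and the sample documents are hypothetical, not Kautilya's actual pipeline (a production version would use a real embedding model and a vector database).

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MiniIndex:
    """In-memory index: ingest documents, retrieve by similarity."""

    def __init__(self):
        self.docs = []  # list of (doc_id, text, vector)

    def ingest(self, doc_id: str, text: str) -> None:
        self.docs.append((doc_id, text, embed(text)))

    def search(self, query: str, k: int = 3):
        qv = embed(query)
        scored = [(cosine(qv, vec), doc_id, text)
                  for doc_id, text, vec in self.docs]
        scored.sort(reverse=True)
        return scored[:k]

index = MiniIndex()
index.ingest("d1", "vector search with embeddings")
index.ingest("d2", "cleaning and deduplicating records")
results = index.search("embedding based vector search", k=1)
print(results[0][1])  # best-matching doc id: d1
```

In a RAG pipeline, the top-k texts returned by `search` would be stuffed into the LLM prompt as grounding context; the index, embedding model, and ranking are the pieces this role owns.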
What we value
- Solid grasp of Python fundamentals and async processing
- Understanding of data cleaning and deduplication at scale
- Familiarity with vector databases and embeddings
- Comfort with FastAPI, background workers, and APIs
- Bias for speed and efficiency in data processing
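"Deduplication at scale" usually means a single streaming pass with a content hash rather than pairwise comparison. The sketch below is one minimal way to do it; the `normalize` rules and sample records are illustrative assumptions, not a prescribed implementation.

```python
import hashlib

def normalize(text: str) -> str:
    # Minimal cleaning: lowercase and collapse whitespace so that
    # trivially different copies hash to the same value.
    return " ".join(text.lower().split())

def dedupe_stream(records):
    # One pass over the stream; O(1) set lookup per record.
    # Storing fixed-size digests instead of full documents keeps
    # memory bounded even for large corpora.
    seen = set()
    for rec in records:
        digest = hashlib.sha256(normalize(rec).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield rec

raw = ["Hello  World", "hello world", "Goodbye"]
print(list(dedupe_stream(raw)))  # ['Hello  World', 'Goodbye']
```

Because it is a generator, `dedupe_stream` composes naturally with an async or background-worker ingestion stage: records flow through without ever materializing the whole corpus.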
Tech Stack
- Python 3.10+, FastAPI
- Pandas, Pydantic
- PostgreSQL, Redis
- LangChain, LlamaIndex
First 90 Days
- Build Kautilya Data Pipeline v1
- Implement embedding generation
- Integrate semantic search & RAG
- Create monitoring & analytics
Why join
- AI-first product
- Scale + structure
- Ownership from day one
- Category-defining problem
Access Challenge
Optimize this Python data processing pipeline. The challenge tests algorithmic efficiency, performance optimization, and data engineering intuition — skills essential to this role.
Python Code Optimizer
Optimize this document processing script step by step. Apply Python performance best practices to achieve sub-10 second execution time. Each optimization reveals the next level.
The starting point is a basic nested-loop pass over 10k documents: slow and memory-intensive.
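The actual challenge script is not reproduced here, but the classic fix for this pattern is replacing the O(n²) pairwise comparison with a single O(n) pass using set membership. The sketch below is a hypothetical illustration of that jump, not the challenge code itself.

```python
# 10k documents where each distinct value appears exactly twice.
docs = [f"document {i % 5000}" for i in range(10_000)]

def find_dupes_naive(docs):
    # O(n^2): compares every pair. At 10k docs that is ~50 million
    # comparisons in pure Python -- the "slow and memory intensive" baseline.
    dupes = set()
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if docs[i] == docs[j]:
                dupes.add(docs[i])
    return dupes

def find_dupes_fast(docs):
    # O(n): one pass, constant-time set lookups.
    seen, dupes = set(), set()
    for d in docs:
        if d in seen:
            dupes.add(d)
        else:
            seen.add(d)
    return dupes

sample = docs[:200]  # keep the naive version on a small slice
assert find_dupes_naive(sample) == find_dupes_fast(sample)
print(len(find_dupes_fast(docs)))  # 5000 duplicated values
```

The same principle (trade repeated scans for a hash-backed lookup) generalizes to most of the challenge's later levels: batching, caching, and avoiding per-item recomputation.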
Ready to architect data intelligence?
Optimize the pipeline challenge above to unlock the application form and join our data mission.