Data Practice
What’s New? ✨
vowl
New: Validation engine for ODCS data contracts. Define your validation rules in a YAML data contract and get actionable reports on your data's quality.
Synthetic Data Primer
A practitioner's guide to synthetic data — from what it is and why it matters, to generation methods (classical and LLM-driven), privacy, quality,...
CRISPr Framework
Update: New Data Quality Monitoring chapter added.
Highlights
Behind Singapore Government's Data Engineering | #GovTechDecoded Ep 7
🤖 From chatbots answering your queries to digital services that just work, AI and data play a huge role behind the scenes. But how do we ensure they...
STACK Meetup - Data Standards that Scale: Engineering Effective Data Platforms
In this STACK Meetup, hear from GovTech’s Data Practice and SG Gov teams on how data platforms are engineered to optimise the sharing of data sets...
Events
Primers, Frameworks and Playbooks
Synthetic Data Primer
A practitioner's guide to synthetic data — from what it is and why it matters, to generation methods (classical and LLM-driven), privacy, quality,...
CRISPr Framework
A structured framework that addresses the complexities of designing, implementing, and monitoring data pipelines.
CRISPr Implementation Guide
A collection of technical guides that are based on proven use cases and approaches, offering detailed, step-by-step instructions to deploy reusable...
Data Engineering Initiative Playbook
A comprehensive guide for organisations embarking on their data engineering journey.
Research & Articles
Data & AI
SLENDER: Structured Outputs for SLM-based NER in Low-Resource Englishes
Nicole Ren, James Teo. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track). 2025.
Data Engineering in the Age of AI: New Horizons in Data Discovery
Building Fill.sg, a GenAI Report Toolkit
FulFILL your dreams of having AI write repetitive reports for you. Find out how we built Fill.sg in 2 weeks at LAUNCH!
Applied LLM Quantisation with AWS Sagemaker | Analytics.gov
Host production-ready LLM endpoints at twice the speed and one-fifth the cost.
Data Privacy
Synthetic Data: Navigating Its Methodologies, Applications and Challenges
It’s beyond doubt that our current technological era is defined by data-driven decision-making, with a pursuit of AI that possesses…
Sharing Data with Differential Privacy: A Primer
This article is the first in a four-part series on Differential Privacy by GovTech’s Data Privacy Protection Capability Centre (DPPCC)…
Practitioners’ Guide to Accessing Emerging Differential Privacy Tools
This article is the second in a four-part series on Differential Privacy by GovTech’s Data Privacy Protection Capability Centre (DPPCC)…
Evaluating Differential Privacy Tools’ Performance
This article is the third in a four-part series on Differential Privacy by GovTech’s Data Privacy Protection Capability Centre (DPPCC)…
Getting Started with Scalable Differential Privacy Tools on the Cloud
This article is the last in a four-part series on Differential Privacy by GovTech’s Data Privacy Protection Capability Centre…
Agency Engagements
The Unsung Hero: How data engineering powers insights for public health in Singapore
Productionising LLMs and ML Models with Analytics.gov: MOM’s Journey into AI Solution Deployment
Shoutout to our co-contributors for this article: MOM Forward Deployed Team (Barry Tng, Ethan Mak, Joel Koo), and Container Stack team…
Empowering Public Officers: A Data Engineering Masterclass for the Public Service
Get insights from the inaugural Data Engineering Masterclass!
Accelerating Machine Learning and AI impact with MLOps on Analytics.gov
Introduction to Analytics.gov
Internship Stories
My Internship Journey in DSAID-DE @ Cloak — The Central Privacy Toolkit
Hello everyone! I’m Sean, a final-year computer science major at Nanyang Technological University.
GovTech enCRYPT — Free-text Anonymisation
The Central Privacy-Preserving Toolkit (enCRYPT)
Unleash Data Potential @ Analytics.gov — My Experience as a Software Engineer Intern
Hi everyone! 😀 I’m Pun Pun, a third-year Computer Science student at the National University of Singapore (NUS).
My Internship at DSAID-DE - Data Infrastructure as a Box
The COVID-19 pandemic accelerated digital transformation across all sectors in Singapore, including the public sector. Many businesses had to adapt to...