Proof · Trust Evidence

Proof, not promises.

Data that is usable for AI execution, privacy-safe in production, and stable across runs. The case records, certifications, patents, awards, and partnerships that prove it, in one place.

< 4 hrs · Root cause identification · 21 days → under 4 hrs (~99% faster)
88.55% · F1-score (DTS augmentation) · 58.55% → 88.55% (+30pp)
1 day · Model time-to-deploy · 4 weeks → 1 day (over 90% faster)
Customers & Partners

Across banking, insurance, legal, the public sector, and telecom.

ISO/IEC 27001:2022 Information Security · ISO/IEC 42001:2023 AI Management · GS Certified Grade 1, CUBIG 2025 · GS Certified Grade 1, LLM Capsule 2024 · KISA Fast Track 2024 · Information Security Innovation Award 2024 · Emerging AI+X Top 100 2026 (AIIA) · NextRise Global Innovator 2024 · Deutsche Telekom T Challenge 2026 Finalist · Intellyx Digital Innovator Award 2026 · Startup World Cup Finalist 2024 · AI Medical Innovation Award, AI EXPO KOREA 2025
Certifications & Awards

Backed by international certifications and industry awards.

Operational Evidence

From PoC to production.

Financial Services
Model retraining pipeline: schema drift detection
Execution Stability
21 d Root cause time (before)
< 4 hr Detection time (after)
2 Feature columns removed
1 Schema type coercion
Before

Schema change in upstream data caused silent model degradation. Root cause took 21 days to identify. By then, downstream decisions had already been affected.

After

Syntitan Release State detected the schema diff at ingestion. Issue flagged before the next training run was triggered. No degraded model reached production.

What Changed

2 feature columns removed from upstream feed. 1 schema type coercion introduced silently. Release State diff surfaced both in the change log.

Reproduce

Prior run re-executed under locked Release State conditions. Root cause identification time: 21 days → under 4 hours (~99% reduction).

State Card · Change Log · Re-run Record · Schema Diff
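
The schema diff in this case can be pictured with a minimal sketch. It is not Syntitan's implementation: the function name `schema_diff`, the column names, and the dtype strings are hypothetical, chosen only to mirror the case (two removed feature columns, one silent type coercion).

```python
from typing import Dict, List, Tuple


def schema_diff(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, list]:
    """Compare two {column: dtype} mappings and report what changed."""
    removed: List[str] = sorted(set(before) - set(after))
    added: List[str] = sorted(set(after) - set(before))
    # A column present in both snapshots with a different dtype is a coercion.
    coerced: List[Tuple[str, str, str]] = sorted(
        (col, before[col], after[col])
        for col in set(before) & set(after)
        if before[col] != after[col]
    )
    return {"removed": removed, "added": added, "coerced": coerced}


# Hypothetical schemas, for illustration only.
previous = {"account_id": "int64", "balance": "float64",
            "tenure_months": "int64", "region": "string", "risk_band": "string"}
current = {"account_id": "int64", "balance": "string", "region": "string"}

diff = schema_diff(previous, current)
print(diff)
```

Running a check like this at ingestion, before a training run starts, is what turns a silent upstream change into a visible change-log entry.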
Telecom
Real-time inference service: pipeline version rollback
Execution Stability
Unknown Drift source (before)
< 2 hr Rollback time (after)
100% Score distribution match
Before

Preprocessing pipeline update produced inconsistent scores in production. No way to trace which version caused the score drift.

After

Run Binding linked every score to its exact Release State. Rollback to stable state completed in under 2 hours.

What Changed

Normalization logic updated across the preprocessing step. Feature scaling range shifted by 12%. Release State diff identified both changes with exact pipeline version reference.

Reproduce

Stable release re-run confirmed. Score distribution matched baseline output. Run Binding record archived for future regression checks.

State Card · Change Log · Re-run Record
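
One way to picture Run Binding is a score record that carries the fingerprint of the state that produced it. The sketch below is an assumption-laden illustration, not the product's data model: `state_id`, the state fields, the version strings, and the scaling ranges are all hypothetical.

```python
import hashlib
import json
from collections import defaultdict
from statistics import mean


def state_id(state: dict) -> str:
    """Stable fingerprint of an execution state (canonical JSON, SHA-256)."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical states: only the preprocessing version and scaling range differ.
stable = {"pipeline_version": "1.4.0", "scaling_range": [0.0, 1.0]}
updated = {"pipeline_version": "1.5.0", "scaling_range": [0.0, 1.12]}

# Each score is stored together with the id of the state that produced it.
bound_scores = [
    {"score": 0.71, "state": state_id(stable)},
    {"score": 0.69, "state": state_id(stable)},
    {"score": 0.80, "state": state_id(updated)},
    {"score": 0.78, "state": state_id(updated)},
]

# Grouping by state id shows which pipeline state a drifted score came from.
by_state = defaultdict(list)
for record in bound_scores:
    by_state[record["state"]].append(record["score"])

for sid, scores in by_state.items():
    print(sid, round(mean(scores), 3))
```

Because the fingerprint is computed over sorted keys, the same state always yields the same id, which is what makes a rollback target unambiguous.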
Manufacturing
Quality inspection model: rare defect class coverage
Data Usability
3 Underrepresented classes
+30pp F1-score improvement
DP-safe Synthesis method
Before

Rare defect class underrepresented in training data. F1-score capped at 58.55%. Model missed edge cases in production.

After

DTS generated differentially private synthetic samples for 3 underrepresented classes. F1-score rose to 88.55% (+30pp). Coverage gap closed before next training cycle.

What Changed

3 underrepresented defect classes augmented with DP-safe synthetic data. Class distribution rebalanced. Augmented dataset versioned within Syntitan Release State.

Reproduce

Augmented dataset versioned and bound to Release State. Same training run reproducible on demand. Defect detection recall verified against holdout.

State Card · Dataset Version · Re-run Record · Class Dist. Log
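
For readers less familiar with the metric, the sketch below shows the standard F1 definition and the difference between a percentage-point gain and a relative gain. Only the two F1 values come from the case above. The helper name `f1_score` is illustrative, and the precision and recall behind the reported figures are not stated in the source.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


before_f1 = 58.55  # percent, from the case above
after_f1 = 88.55   # percent, from the case above

point_gain = after_f1 - before_f1              # percentage points
relative_gain = point_gain / before_f1 * 100   # percent relative to the baseline

print(f"+{point_gain:.2f} pp, {relative_gain:.1f}% relative")
```

The "+30pp" figure is the absolute difference. Measured against the 58.55% baseline, the same change is a relative improvement of roughly 51%.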
Healthcare
Clinical AI validation: restricted patient data replacement
Data Usability
Blocked Validation status (before)
Unblocked Validation status (after)
DP-safe Synthesis method
Before

Real patient records required for model validation could not be accessed due to regulatory constraints. Validation pipeline stalled.

After

DTS generated DP-safe synthetic patient records matching real distribution characteristics without containing real identifiable information. Validation unblocked.

What Changed

Non-accessible real records replaced with DP-safe synthetic equivalents. Data distribution preserved. Compliance review passed. Validation pipeline resumed without modification.

Reproduce

Synthetic dataset versioned in Syntitan. Validation run reproducible with same synthetic distribution on demand. Audit trail maintained throughout.

State Card · DP Audit Log · Dataset Version
Insurance
LLM-assisted claims processing: sensitive data substitution
Secure LLM Usage
Exposed Sensitive data in prompts (before)
Substituted Sensitive fields (after)
Preserved Output usability
Before

Claims documents containing policyholder names, ID numbers, and medical details were sent directly to an external LLM API. Compliance team blocked the workflow.

After

LLM Capsule substituted sensitive fields with restorable stand-ins before submission. Outputs returned and reconstructed locally for downstream system use.

What Changed

LLM Capsule layer inserted into the workflow. Substitution covered names, IDs, dates, and medical field patterns. Sensitive raw values stayed in the local token vault.

Reproduce

Each substitution run logged and bound to Syntitan Release State. Workflow reproducible with same substitution logic for audit and regression verification.

State Card · Substitution Log · Token Vault Record · Re-run Record
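
The substitute-then-reconstruct flow can be sketched in a few lines. This is a toy illustration under stated assumptions, not LLM Capsule's method: the regex patterns, token format, and function names are hypothetical, and real coverage of names and medical fields would need far more than two patterns.

```python
import re
from typing import Dict, Tuple

# Hypothetical patterns; a real deployment would need much broader coverage.
PATTERNS = {
    "ID": re.compile(r"\b\d{6}-\d{7}\b"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def substitute(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace sensitive values with stand-ins; return the text and a local vault."""
    vault: Dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        def _swap(match: re.Match, label: str = label) -> str:
            token = f"<{label}_{len(vault) + 1}>"
            vault[token] = match.group(0)  # the raw value stays in the vault
            return token
        text = pattern.sub(_swap, text)
    return text, vault


def reconstruct(text: str, vault: Dict[str, str]) -> str:
    """Restore the original values locally from the vault."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


claim = "Policy holder ID 900101-1234567 filed a claim on 2024-03-15."
masked, vault = substitute(claim)
# `masked` is what would cross the boundary; `vault` never leaves it.
llm_output = f"Summary: {masked}"  # stand-in for an external LLM response
print(reconstruct(llm_output, vault))
```

The external service only ever sees `masked`. Restoring the output requires the vault, which is why the vault staying local is the property that matters.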
Retail / E-commerce
Recommendation engine: runtime environment drift
Execution Stability
Days Debugging time (before)
< 3 hr Root cause identified (after)
Exact Environment reproduced
Before

Recommendation scores degraded after a routine infrastructure upgrade. Engineers could not reproduce the pre-upgrade behavior.

After

Syntitan captured every runtime parameter in the Release State at execution time, with Run Binding tying the run to that state. Pre-upgrade Release State re-run in under 3 hours.

What Changed

Library version bump changed default float precision handling. Embedding normalization behavior altered. Release State diff identified the exact library version delta.

Reproduce

Pre-upgrade Release State restored exactly. Score distribution delta measured and confirmed. Infrastructure team patched and validated against the restored baseline.

State Card · Runtime Snapshot · Change Log · Re-run Record
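
A runtime snapshot of the kind described here could look roughly like the sketch below. It uses only the Python standard library. The snapshot fields, the helper names, and the choice of `numpy` as the tracked package are assumptions for illustration, and what Syntitan actually captures is not specified in this page.

```python
import platform
import sys
from importlib import metadata
from typing import Dict, Iterable


def runtime_snapshot(packages: Iterable[str]) -> Dict[str, str]:
    """Record interpreter, platform, and selected package versions."""
    snapshot = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "float_digits": str(sys.float_info.dig),
    }
    for name in packages:
        try:
            snapshot[f"pkg:{name}"] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snapshot[f"pkg:{name}"] = "not installed"
    return snapshot


def snapshot_delta(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, tuple]:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


current = runtime_snapshot(["numpy"])
# Simulate a pre-upgrade snapshot with a hypothetical older library version.
previous = dict(current, **{"pkg:numpy": "0.0.0-hypothetical"})
print(snapshot_delta(previous, current))
```

Comparing two snapshots narrows "something changed in the upgrade" down to a named key and a pair of versions.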
Public Sector
Aggregate-data release: automated screening & audit trail
Execution Stability
Manual Release screening (before)
Automated Screening (after)
0.94 PII detection F1
Multi-agent Detect · trace · transform
Before

Data-center users exporting sensitive aggregate statistics required manual, per-desk screening and release review. The process was inconsistent and hard to audit.

After

A per-desk transformation module plus a multi-agent pipeline detects, traces, and transforms personal information in aggregate data, automating and standardizing the release-review process.

What Changed

Release State fingerprints the data before and after transformation, so which records were changed, and how, stays traceable for audit.

Reproduce

A prior release can be replayed against its bound Release State, reproducing the screening process for regulatory inspection.

Screening Report · Release Audit Log · Detection Trace · State Card
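
Fingerprinting data before and after transformation can be illustrated with per-record hashes. The sketch below is a simplification under assumptions: the record layout, the small-cell suppression rule, and the function names are hypothetical and not taken from the deployed pipeline.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of one record."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_records(before: Dict[str, dict], after: Dict[str, dict]) -> List[str]:
    """Return ids of records whose fingerprint differs after transformation."""
    return sorted(
        rid for rid in before
        if rid not in after or fingerprint(before[rid]) != fingerprint(after[rid])
    )


# Hypothetical aggregate rows keyed by a row id.
raw = {
    "r1": {"district": "A", "households": 1200},
    "r2": {"district": "B", "households": 3},     # small cell, re-identification risk
}
released = {
    "r1": {"district": "A", "households": 1200},
    "r2": {"district": "B", "households": "<5"},  # suppressed by a screening step
}

print(changed_records(raw, released))
```

Storing both fingerprints with the release lets an auditor confirm later which rows were altered without re-reading the raw values.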
Defense
LLM adoption in air-gapped environments: classified context preserved
Secure LLM Usage
Blocked AI adoption (before)
Enabled AI adoption (after)
0% Raw context egress
N2SF Guideline aligned
Before

In an air-gapped environment, classified context could not be sent outside, so adopting an external LLM for the work stalled before it began.

After

LLM Capsule substitutes the sensitive context with restorable stand-ins locally. Only the substituted capsule reaches the external LLM, and the result is reconstructed locally inside the boundary. The original context stays local, so the team can put AI to work in a form aligned with N2SF guidelines.

What Changed

Sensitive context is substituted with local stand-ins before processing and reconstructed locally afterward. The original stays within the local boundary.

Reproduce

Every encapsulation/restoration event is logged locally, so any processed request can be reconstructed and inspected within the boundary.

Local Token Vault · Audit Log · N2SF Alignment
Industrial · OT/ICS
OT network data: AI-ready transformation for threat analysis
Data Usability
Restricted Raw OT data (before)
Enabled AI threat analysis (after)
Structure-preserving Transformation
Before

OT/ICS network data carried sensitive operational details, so it could not be sent to an external AI for automated threat analysis.

After

Structure-preserving transformation lets an AI agent analyze the network data and answer threat questions. Sensitive values are replaced with stand-ins while relationships stay intact. (Integrated with a global OT security platform's detection solution.)

What Changed

Network-data sensitive fields are substituted while topology and relationships are kept intact, so the agent can reason over realistic context.

Reproduce

The transformed dataset and the agent's analysis are bound to a fixed data state, so the same analysis can be re-run and verified.

Transformed Dataset · Agent Analysis Log · Structure Map
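
"Structure-preserving" can be made concrete with a small sketch: every host gets a consistent stand-in, so the graph keeps its shape while the real addresses disappear. The flow data, the `node-N` naming, and the helper functions are hypothetical, and the transformation used in the integration may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def pseudonymize_edges(edges: List[Edge]) -> Tuple[List[Edge], Dict[str, str]]:
    """Replace each host with a consistent stand-in, keeping every edge."""
    mapping: Dict[str, str] = {}

    def stand_in(host: str) -> str:
        # The same host always maps to the same stand-in, so relationships survive.
        if host not in mapping:
            mapping[host] = f"node-{len(mapping) + 1}"
        return mapping[host]

    return [(stand_in(src), stand_in(dst)) for src, dst in edges], mapping


def degree_profile(edges: List[Edge]) -> List[int]:
    """Sorted node degrees: a label-independent summary of the topology."""
    counts = Counter(node for edge in edges for node in edge)
    return sorted(counts.values())


# Hypothetical OT flows (source host, destination host).
flows = [("10.0.0.5", "10.0.0.9"), ("10.0.0.7", "10.0.0.9"), ("10.0.0.5", "10.0.0.7")]
masked, mapping = pseudonymize_edges(flows)

print(masked)
print(degree_profile(flows) == degree_profile(masked))
```

The `mapping` plays the role of the locally held record: with it, findings about a stand-in node can be traced back to the real host inside the boundary.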
Telecom
Network operations model: topology change impact tracing
Execution Stability
Unknown Drift source (before)
Traced Root cause (after)
Diff Topology change surfaced
Before

After a change in the network topology, a NOC AI model's outputs drifted, but engineers could not tell which change caused it, since the model itself was unchanged.

After

Syntitan's Release State captured the network and configuration state bound to each run, and the Diff surfaced the exact topology change; the pre-change state was re-run to confirm the impact.

What Changed

Specific topology and configuration elements changed between runs. The Release State diff identified them against the bound prior run.

Reproduce

The pre-change run was replayed against its bound Release State to confirm the topology change was the cause and to validate the fix.

State Card · Topology Diff · Change Log · Re-run Record
Certifications

Standards, third-party verified.

Third-party validated certifications across information security, privacy, and operational standards.

ISO 27001 · Information Security Management

ISO/IEC 27001:2022 · 2026

International standard for information security management. Demonstrates a systematic approach to managing sensitive information.

ISO 42001 · AI Management System

ISO/IEC 42001:2023 · 2026

International standard for AI management systems. Demonstrates responsible AI governance and risk management.

GS Certification Grade 1 · DTS (2025)

GS Certification Grade 1 · 2025

Korean SW Quality Certification, Grade 1 (2025). Verified quality, eligible for public procurement.

GS Certification Grade 1 · LLM Capsule (2024)

GS Certification Grade 1 · 2024

Korean SW Quality Certification, Grade 1 (2024). Listed on the public Innovation Marketplace for procurement.

KISA Fast Track 2024

KISA · 2024

Selected for the KISA information-security industry Fast Track program.

Patents

The patents behind the tech.

Registered patents and pending applications behind Syntitan, DTS, and LLM Capsule. The technical foundation of the operating layer.

▸ Patent · KR Registered

Method and Data Processing Apparatus for De-identifying Data While Preserving Target Characteristics

KR Reg. No. 10-2926046 · App. No. 10-2023-0167085 · Registered 2026-02-06

Core DTS patent. Method for de-identifying source data while preserving target characteristics such as statistical distributions and label structure.

▸ Patent · KR Registered · US Pending

Synthetic Data Generation Method Without Leaking Target Information and Client Apparatus

KR Reg. No. 10-2818137 (App. 10-2024-0017564, Registered 2025-06-04) · US App. No. 19/039,319 (under examination)

DTS synthesis patent. Client-server architecture for generating synthetic data without exposing target information, registered in Korea and pending in the US.

▸ Patent · KR Registered · US Pending

Method and Data Processing Apparatus for Generating a Synthetic Dataset Containing Multiple Attributes

KR Reg. No. 10-2818136 · App. No. 10-2024-0131551 · Registered 2025-06-04 · US Pub. No. US 2026/0017275 A1 (under examination)

DTS multi-attribute synthesis patent. Method for generating complex synthetic datasets that span multiple feature columns and attribute types.

▸ Patent · KR Registered · US Pending

AI-Based Service Providing Method Without Leaking Private Information and Client Apparatus

KR Reg. No. 10-2757651 (App. 10-2023-0133086, Registered 2025-01-16) · US App. No. 18/908,054 (Filed 2024-10-07, 1st OA response 2026-05-18)

Core LLM Capsule patent. Method and client apparatus for AI services without exposing private information, registered in Korea and pending in the US.

▸ Patent · KR Pending

Data Management Method and System for AI Execution Control

KR App. No. 10-2026-0053050 · Filed 2026-03-24 · Expedited examination granted 2026-04-08

Core Syntitan patent application. Method and system for controlling and managing data state within AI execution environments. Expedited examination granted.

▸ Patent · KR Pending · US Pending

Method for Providing Security for On-Device Artificial Intelligence Models

KR App. No. 10-2025-0003223 (Filed 2025-01-09) / 10-2026-0000037 (priority, Filed 2026-01-02) · US App. (Ref. PO25-025-US, via export-voucher)

Security provisioning method for AI models running on-device, with Korean priority applications and a corresponding US filing.

▸ Patent · KR Pending · US Pending

Method and Data Processing Apparatus for Validating Synthetic Datasets for Model Training

KR App. No. 10-2024-0174041 (Filed 2024-11-28) · US App. No. 19/400,665 (Filed 2025-11-25) · under examination

DTS patent application for validating synthetic datasets used to build training models, filed in both Korea and the US.

▸ Patent · KR Pending

Method and Data Processing Apparatus for Filtering Synthetic Datasets for Model Training

KR App. No. 10-2024-0174042 · Filed 2024-11-28 · Under examination

DTS patent application. Method for filtering synthetic datasets prior to model training.

▸ Patent · KR Pending

Method and Inference Apparatus for Building Deep Learning Models Robust to Private Information Exposure

KR App. No. 10-2023-0074745 · Filed 2023-06-12 · Office Action response due 2026-07-25

Deep learning model construction robust to private information exposure. Applicant: Ewha Womans University (co-research).

▸ Patent · KR Pending

Method and Analysis Apparatus for Building Artificial Intelligence Models that Process Heterogeneous Datasets

KR App. No. 10-2023-0013029 · Filed 2023-01-31 · Under examination (response filed 2026-01-14)

AI model construction method for heterogeneous datasets. Applicant: Ewha Womans University (co-research).

Research

The research behind the products.

Selected publications by CUBIG founders, from peer-reviewed venues to a widely cited survey preprint. The privacy and robustness research behind Syntitan, DTS, and LLM Capsule.

Publication · JMLR 2025

Regularizing Hard Examples Improves Adversarial Robustness

Hyungyu Lee, Saehyung Lee, Ho Bae, Sungroh Yoon · Journal of Machine Learning Research · 2025

Adversarial robustness method that regularizes hard examples to improve robust generalization.

Publication · ICLR 2024

DAFA: Distance-Aware Fair Adversarial Training

Hyungyu Lee, Saehyung Lee, Hyemi Jang, Junsung Park, Ho Bae, Sungroh Yoon · ICLR · Vienna, May 2024

Adversarial training method that enforces fairness across subgroups via distance-aware margin adjustment.

Publication · Sensors 2024

Evaluation of Malware Classification Models for Heterogeneous Data

Ho Bae · Sensors (MDPI) · 2024

Study of malware-classifier explainability on heterogeneous data. Existing explanations fall short, and high accuracy can give a misleading sense of security.

Publication · ESORICS 2024

VFLIP: A Backdoor Defense for Vertical Federated Learning via Identification and Purification

Yungi Cho, Woorim Han, Miseon Yu, Younghan Lee, Ho Bae, Yunheung Paek · ESORICS · 2024

First backdoor defense specialized for Vertical Federated Learning. It identifies and purifies backdoor-triggered embeddings at inference.

Publication · BIBM 2023

Privacy-Preserving Publishing of Individual-Level Medical Data for Cloud Services

Ho Bae, Heonseok Ha, Siwon Kim · IEEE BIBM · Istanbul, Dec 2023

Formal privacy-preserving framework for publishing patient-level medical records to cloud services, with emphasis on utility preservation under strict privacy constraints.

Publication · ESORICS 2023

FLGuard: Byzantine-Robust Federated Learning via Ensemble of Contrastive Models

Younghan Lee, Yungi Cho, Woorim Han, Ho Bae, Yunheung Paek · ESORICS · 2023

Byzantine-robust federated learning that detects malicious clients via an ensemble of contrastive models, strong under non-IID data.

Publication · RAID 2023

Exploring Clustered Federated Learning's Vulnerability against Property Inference Attack

Hyunjun Kim, Yungi Cho, Younghan Lee, Ho Bae, Yunheung Paek · RAID · 2023

Reveals property-inference privacy risks in clustered federated learning.

Publication · IEEE/ACM TCBB 2022

DNA Privacy: Analyzing Malicious DNA Sequences Using Deep Neural Networks

Ho Bae, Seonwoo Min, Hyun-Soo Choi, Sungroh Yoon · IEEE/ACM Transactions on Computational Biology and Bioinformatics · 2022

Deep-learning analysis of malicious DNA sequences for security and privacy in genomic data.

Publication · BMVC 2022

MPGAN: Membership Privacy-Preserving GAN

Heonseok Ha, Uiwon Hwang, Jaehee Jang, Ho Bae, Sungroh Yoon · BMVC · London, Nov 2022

GAN training method that prevents membership inference attacks on generated data, providing formal privacy guarantees for synthetic outputs.

Publication · ACM AsiaCCS 2022

Membership Feature Disentanglement Network

Heonseok Ha, J Jang, Y Jeong, S Yoon · ACM Asia Conference on Computer and Communications Security · 2022

Network architecture that disentangles membership-sensitive features from model representations, reducing exposure to membership inference attacks.

Publication · IEEE Access 2021

Gradient Masking of Label Smoothing in Adversarial Robustness

Hyungyu Lee, Ho Bae, Sungroh Yoon · IEEE Access · 2021

Analysis of how label smoothing induces gradient masking, a false sense of robustness that does not transfer to true adversarial settings.

Publication · IEEE TAI 2021

Learn2Evade: Learning-based Generative Model for Evading PDF Malware Classifiers

Ho Bae, Younghan Lee, Yohan Kim, Uiwon Hwang, Sungroh Yoon, Yunheung Paek · IEEE Transactions on Artificial Intelligence · Aug 2021

Adversarial generative modeling of malware evasion: learning to produce feature-space perturbations that bypass PDF malware classifiers while preserving functionality.

Publication · IEEE Access 2020

Anomaly Detection by Learning Dynamics From a Graph

Jaekoo Lee, Ho Bae, Sungroh Yoon · IEEE Access · 2020

Graph-based anomaly detection that learns system dynamics to flag abnormal behavior.

Publication · PSB 2020

AnomiGAN: Generative Adversarial Networks for Anonymizing Private Medical Data

Ho Bae, Dahuin Jung, Hyun-Soo Choi, Sungroh Yoon · Pacific Symposium on Biocomputing · Hawaii, Jan 2020

GAN-based anonymization of private medical datasets while preserving statistical utility for downstream analysis.

Publication · PSB 2019

DNA Steganalysis Using Deep Recurrent Neural Networks

Ho Bae, Byunghan Lee, Sunyoung Kwon, Sungroh Yoon · Pacific Symposium on Biocomputing · Hawaii, Jan 2019

Deep recurrent-network method for detecting hidden messages embedded in DNA sequences (steganalysis), applied to genomic data.

Preprint · arXiv 2018

Security and Privacy Issues in Deep Learning

Ho Bae, Jaehee Jang, Dahuin Jung, Hyemi Jang, Heonseok Ha, Sungroh Yoon · arXiv:1807.11655 · 2018

Comprehensive survey of attack surfaces and defenses in deep learning systems, covering adversarial examples, model extraction, and data poisoning.

Awards & Recognition

Recognized by government and industry.

From government program selections to industry awards at home and abroad: third-party validation of our technology and business.

Deutsche Telekom T-Challenge 2026 · 2nd Place
Industry Award

T-Mobile / Deutsche Telekom · 2026

Placed 2nd in the T-Challenge global open-innovation program (T-Mobile / Deutsche Telekom), recognized for LLM Capsule's context-preserving substitution and local reconstruction.

2026 Emerging AI+X Top 100
Industry Recognition

Korea AI Industry Association · 2026

Selected for the 2026 Emerging AI+X Top 100 for its AI-ready data technology.

Selected Supplier · 2026 AI Voucher Program
Government Program

Ministry of Science and ICT · NIPA · 2026

Selected as a supplier for the 2026 AI (Cloud) Voucher program, letting SMEs adopt CUBIG AI-ready data solutions via government vouchers.

Selected Supplier · 2026 Data Voucher Program
Government Program

Korea Data Agency (K-DATA) · 2026

Selected as a Data Voucher supplier, rebuilding restricted enterprise data into AI-ready data with DTS.

Ultra-Gap Startup 1000+ (DIPS 1000+)
Government Program

Ministry of SMEs and Startups · KISED · 2025

Selected in 2025 as a top deep-tech startup (AI / big-data) in the Ultra-Gap Startup 1000+ project, and as a Global ICT Future Unicorn the same year.

NVIDIA Inception
Global Membership

2024–2025

Member of NVIDIA Inception, the global program for AI startups.

Information Security Product Innovation Award · Minister of Science and ICT Prize (2024)
Government Award

Ministry of Science and ICT · 2024

Grand Prize, Information & Physical Security category, at the 2024 H2 Information Security Product Innovation Awards (Minister of Science and ICT Prize).

Startup World Cup Finalist (2024)
Industry Award

2024

Finalist at the global Startup World Cup.

NextRise Global Innovator (2024)
Industry Award

2024

Selected as a NextRise Global Innovator.

SK Telecom × Hana Bank AI Accelerator
Accelerator

SK Telecom · Hana Bank · 2024

Selected for the 2nd SK Telecom × Hana Bank AI startup accelerator (15 of 230 applicants).

Partnerships

The ecosystem we build with.

Cloud and infrastructure partners that the operating layer composes with.

AWS Marketplace: DTS · LLM Capsule
Naver Cloud Platform: DTS
FAQ

Frequently asked questions.

Common questions on operational evidence, reproducibility, and sensitive-data handling.

What is operational evidence in AI systems?
Operational evidence in AI systems is concrete, verifiable documentation that shows how an AI system behaves in production: what changed between runs, what caused behavioral differences, and whether the same conditions can be replayed to verify what produced a result. It includes before/after outcomes, state comparisons, and re-run records.
What is reproducible AI execution?
Reproducible AI execution means that given the same execution conditions (data state, schema, preprocessing logic, runtime dependencies), an AI system's run can be replayed to inspect and verify its behavior. Syntitan achieves this through Release State, Run Binding, and Reproduce.
Why is AI execution unstable in production?
AI execution becomes unstable in production when execution conditions change after deployment: data schemas, preprocessing logic, dependencies, data windows. Any of these can cause AI results to drift without any change to the model itself. When execution state is not captured and fixed, it is hard to pinpoint which change caused a production issue.
How does CUBIG handle sensitive data in LLM workflows?
CUBIG's LLM Capsule substitutes sensitive values with restorable stand-ins before the data crosses to an external LLM. The original values stay in the local token vault inside the enterprise boundary. The LLM operates on the substituted version; output is reconstructed locally for downstream use. This pattern is the Substitute · Execute · Reconstruct sequence.
Can CUBIG be deployed on-premises or air-gapped?
Yes. Syntitan and LLM Capsule are designed to support on-premises and air-gapped deployments for regulated and defense workflows. A detailed architecture is available on request for qualified evaluations.

The proof is in the execution.

Bring one workflow that isn't reproducible today. We'll show, on your data, what AI-ready execution changes.