CUBIG
Glossary

What is Leakage (machine learning)?

Leakage in machine learning is the unintended use, during training, of information that would not be available when the model makes real predictions, which artificially inflates its measured predictive performance. It occurs when test data is improperly included in training or when future information leaks into the training process. The result is evaluation scores that look better than they should, a model that overfits to the leaked signal, and unreliable real-world performance.
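
The two causes named in this definition can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the toy `feature` column, the `rows` records with their `t` timestamp field, and the helper names `fit_scaler`, `transform`, and `chronological_split` are all hypothetical and chosen for this example.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Return (mean, std) computed from the given values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 if all values are equal
    return mu, sigma


def transform(values, mu, sigma):
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


def chronological_split(rows, cutoff):
    """Train on rows strictly before `cutoff`, test on rows at or after it."""
    train_rows = [r for r in rows if r["t"] < cutoff]
    test_rows = [r for r in rows if r["t"] >= cutoff]
    return train_rows, test_rows


# Hypothetical feature column; the last two values are held out as test data.
feature = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]
train, test = feature[:4], feature[4:]

# Leaky: scaling statistics are computed on train and test together,
# so test-set information shapes the training inputs.
leaky_mu, leaky_sigma = fit_scaler(feature)

# Leak-free: statistics come from the training split only and are then
# reused, unchanged, on the test split.
clean_mu, clean_sigma = fit_scaler(train)
train_scaled = transform(train, clean_mu, clean_sigma)
test_scaled = transform(test, clean_mu, clean_sigma)

# Temporal case: hypothetical time-stamped records split at t = 4, so no
# training row is later than any test row.
rows = [{"t": t, "y": t * 2} for t in range(1, 6)]
train_rows, test_rows = chronological_split(rows, cutoff=4)
```

The point of the sketch is the order of operations: anything that is fitted (here, the mean and standard deviation) is fitted on the training split alone, and time-ordered data is split by time rather than at random.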


Related Glossaries

  • Statistical data coding Statistical data coding is the process of assigning numerical or categorical values to qualitative data for analysis. It is commonly used in surveys, machine learning preprocessing, and econometrics to structure data for statistical modeling.
  • Data Lineage Data lineage is the record of where data comes from, how it moves, and how it is transformed across systems, from source to the table or model that uses it.
  • Data packaging Data packaging refers to the process of structuring and formatting data for easy storage, retrieval, and exchange. It ensures that datasets are standardized, properly labeled, and compatible with various analytical tools, enhancing usability in AI training, big data processing, and…
  • Data Mart Data mart refers to a subset of a data warehouse that is focused on a specific business function or department. It provides tailored access to relevant data, improving query performance and decision-making for targeted analytics and reporting.

CUBIG LTD (United Kingdom)
Company Number: NI735459
21 Arthur Street, Belfast, Antrim, United Kingdom, BT1 4GA

CUBIG CORP (Republic of Korea)
Business Registration: 133-81-45679
E-Commerce Registration: 2023-Seoul-Seocho-2822
4F, NAVER 1784, 95, Jeongjail-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, Republic of Korea

©️ 2026 CUBIG Corp. All Rights Reserved.