CSA Preface
Standards development within the Information Technology sector is harmonized with international standards development. Through the CSA Technical Committee on Information Technology (TCIT), Canadians serve as the SCC Mirror Committee (SMC) on ISO/IEC Joint Technical Committee 1 on Information Technology (ISO/IEC JTC1) for the Standards Council of Canada (SCC), the ISO member body for Canada and sponsor of the Canadian National Committee of the IEC.
For brevity, this Standard will be referred to as CSA ISO/IEC 5259-4 throughout.
The International Standard was reviewed by the CSA TCIT under the jurisdiction of the CSA Strategic Steering Committee on Information and Communications Technology and deemed acceptable for use in Canada. This Standard has been formally approved, without modification, by the Technical Committee and has been developed in compliance with Standards Council of Canada requirements for National Standards of Canada. It has been published as a National Standard of Canada by CSA Group.
Scope
This document establishes general common organizational approaches, regardless of the type, size or nature of the applying organization, to ensure data quality for training and evaluation in analytics and machine learning (ML). It includes guidance on the data quality process for:
— supervised ML with regard to the labelling of data used for training ML systems, including common organizational approaches for training data labelling;
— unsupervised ML;
— semi-supervised ML;
— reinforcement learning;
— analytics.
This document is applicable to training and evaluation data that come from different sources, including data acquisition and data composition, data preparation, data labelling, evaluation and data use. This document does not define specific services, platforms or tools.