Kideog Bae, Young Seok Jeon, Yul Hwangbo, Chong Woo Yoo, Nayoung Han, Mengling Feng
Background Breast cancer subtyping is a crucial step in determining therapeutic options, but the molecular examination based on immunohistochemical staining is expensive and time-consuming. Deep learning opens up the possibility to predict the subtypes based on the morphological information from hematoxylin and eosin staining, a much cheaper and faster alternative. However, training the predictive model conventionally requires a large number of histology images, which is challenging to collect by a single institute.
Objective We aimed to develop a data-efficient computational pathology platform, 3DHistoNet, which is capable of learning from z-stacked histology images to accurately predict breast cancer subtypes with a small sample size.
Methods We retrospectively examined 401 cases of patients with primary breast carcinoma diagnosed between 2018 and 2020 at the Department of Pathology, National Cancer Center, South Korea. Pathology slides of the patients with breast carcinoma were prepared according to the standard protocols. Age, gender, histologic grade, hormone receptor (estrogen receptor [ER], progesterone receptor [PR], and androgen receptor [AR]) status, erb-B2 receptor tyrosine kinase 2 (HER2) status, and Ki-67 index were evaluated by reviewing medical charts and pathological records.
Results The area under the receiver operating characteristic curve and decision curve were analyzed to evaluate the performance of our 3DHistoNet platform for predicting the ER, PR, AR, HER2, and Ki67 subtype biomarkers with 5-fold cross-validation. We demonstrated that 3DHistoNet can predict all clinically important biomarkers (ER, PR, AR, HER2, and Ki67) with performance exceeding the conventional multiple instance learning models by a considerable margin (area under the receiver operating characteristic curve: 0.75-0.91 vs 0.67-0.8). We further showed that our z-stack histology scanning method can make up for insufficient training data sets without any additional cost incurred. Finally, 3DHistoNet offered an additional capability to generate attention maps that reveal correlations between Ki67 and histomorphological features, which renders the hematoxylin and eosin image in higher fidelity to the pathologist.
Conclusions Our stand-alone, data-efficient pathology platform that can both generate z-stacked images and predict key biomarkers is an appealing tool for breast cancer diagnosis. Its development would encourage morphology-based diagnosis, which is faster, cheaper, and less error-prone compared to the protein quantification method based on immunohistochemical staining.
Hyunwoo Park, Jaedong Lee
Federated learning is a decentralized structure for distributed multi-center clinical data research, which is more secure than centralized structures because personal information is not directly shared. However, there are residual threats to information security, such as eavesdropping, training server hacking, and adversarial attacks. This paper presents a hybrid-quantum-key-based secure federated learning (HQK-FL) for distributed multi-center clinical studies. The proposed method is a new approach based on hybrid quantum keys that provides robust security for distributed multi-center disease diagnosis research. We objectively evaluated the effectiveness of the proposed method by experimenting with different models and datasets for predicting coronavirus disease 2019 (COVID-19) and pneumonia using chest X-ray images and predicting sepsis using the Medical Information Mart for Intensive Care (MIMIC-III) dataset, which is a widely used database in medical research. Federated learning showed promising results in improving the accuracy of predicting COVID, pneumonia, and sepsis, and it outperformed the single-center approach. It achieved an average area under the precision–recall curve of 0.791 for COVID, which is 3.7% better than the single-center results. For pneumonia and sepsis, it reached 0.710 and 0.748, which indicates improvements of 6.3% and 3.2%, respectively. We compared and analyzed the resource usage and computational time of HQK-FL through various experiments. HQK-FL can enhance the security of federated learning while maintaining its predictive performance. It can increase the memory usage by up to 4% and slightly increases the computational time. The comparison result showed no significant difference in memory usage and slight differences in the transmission and computational time between the client and server.
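As a rough illustration of the federated setting described above, the sketch below shows a server combining locally trained model weights by a sample-size-weighted average, a common aggregation scheme in federated learning. It is not the HQK-FL protocol: the abstract does not specify the aggregation rule, the hybrid-quantum-key layer is omitted entirely, and the function name, weight vectors, and sample counts are hypothetical.

```python
from typing import List


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Return the sample-size-weighted average of per-client weight vectors.

    Only model weights are passed in; patient-level data never leaves a client.
    (Hypothetical helper for illustration, not part of HQK-FL.)
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the samples.
            averaged[i] += w * size / total
    return averaged
```

For example, two centers holding 1 and 3 samples with weights `[1.0, 2.0]` and `[3.0, 4.0]` pull the average toward the larger center.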
Junetae Kim, Hye Jin Kam, Youngin Kim, Yura Lee, Jae-Ho Lee
Background Mobile apps for weight loss provide users with convenient features for recording lifestyle and health indicators; they have been widely used for weight loss recently. Previous studies in this field generally focused on the relationship between the cumulative nature of self-reported data and the results in weight loss at the end of the diet period. Therefore, we conducted an in-depth study to explore the relationships between adherence to self-reporting and weight loss outcomes during the weight reduction process.
Objective We explored the relationship between adherence to self-reporting and weight loss outcomes during the time series weight reduction process with the following 3 research questions: “How does adherence to self-reporting of body weight and meal history change over time?”, “How do weight loss outcomes depend on weight changes over time?”, and “How does adherence to the weight loss intervention change over time by gender?”
Methods We analyzed self-reported data collected weekly for 16 weeks (January 2017 to March 2018) from 684 Korean men and women who participated in a mobile weight loss intervention program provided by a mobile diet app called Noom. Analysis of variance (ANOVA) and chi-squared tests were employed to determine whether the baseline characteristics among the groups of weight loss results were different. Based on the ANOVA results and slope analysis of the trend indicating participant behavior along the time axis, we explored the relationship between adherence to self-reporting and weight loss results.
Results Adherence to self-reporting levels decreased over time, as previous studies have found. BMI change patterns (ie, absolute BMI values and change in BMI values within a week) changed over time and were characterized in 3 time series periods. The relationships between the weight loss outcome and both meal history and self-reporting patterns were gender-dependent. There was no statistical association between adherence to self-reporting and weight loss outcomes in the male participants.
Conclusions Although mobile technology has increased the convenience of self-reporting when dieting, it should be noted that technology itself is not the essence of weight loss. The in-depth understanding of the relationship between adherence to self-reporting and weight loss outcome found in this study may contribute to the development of better weight loss interventions in mobile environments.
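The slope analysis mentioned in the Methods can be pictured as fitting a straight line to a weekly series and reading off its slope. The sketch below is a generic least-squares slope, assuming equally spaced weeks; the function name and the example values are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def trend_slope(weekly_values: Sequence[float]) -> float:
    """Least-squares slope of a series observed at weeks 0, 1, 2, ...

    A negative slope means the quantity (for example, the number of
    self-reports per week) is falling over time.
    """
    n = len(weekly_values)
    if n < 2:
        raise ValueError("need at least two weekly observations")
    mean_x = (n - 1) / 2  # mean of 0..n-1
    mean_y = sum(weekly_values) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(weekly_values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator
```

A series such as `[7, 6, 5, 4]` reports per week gives a slope of -1 report per week.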
Kwang Sun Ryu, Ha Ye Jin Kang, Sang Won Lee, Hyun Woo Park, Na Young You, Jae Ho Kim, Yul Hwangbo, Kui Son Choi, Hyo Soung Cha
A screening model for estimating undiagnosed diabetes mellitus (UDM) is important for early medical care. There is minimal research on, and a serious lack of, screening models for people with a family history of diabetes (FHD), especially models that incorporate gender characteristics. Therefore, the primary objective of our study was to develop and validate a screening model for estimating UDM among people with FHD. We used data from the Korean National Health and Nutrition Examination Survey (KNHANES). KNHANES (2010–2016) was used as a developmental cohort (n = 5939), and the model was then evaluated in a validation cohort (n = 1047) from KNHANES (2017). We developed screening models for UDM in males (SMM), females (SMF), and males and females combined (SMP) with FHD using backward stepwise logistic regression analysis. The SMM and SMF showed an appropriate performance (area under the curve [AUC] = 76.2% and 77.9%, respectively) compared with the SMP (AUC = 72.9%) in the validation cohort. Consequently, simple screening models were developed and validated for the estimation of UDM among patients in the FHD group, which is expected to reduce the burden on the national health care system.
Junetae Kim, Yu Rang Park, Jeong Hoon Lee, Jae-Ho Lee, Young-Hak Kim, Jin Won Huh
Background Cardiac arrest is the most serious death-related event in intensive care units (ICUs), but it is not easily predicted because of the complex and time-dependent data characteristics of intensive care patients. Given the complexity and time dependence of ICU data, deep learning–based methods are expected to provide a good foundation for developing risk prediction models based on large clinical records.
Objective This study aimed to implement a deep learning model that estimates the distribution of cardiac arrest risk probability over time based on clinical data and assesses its potential.
Methods A retrospective study of 759 ICU patients was conducted between January 2013 and July 2015. A character-level gated recurrent unit with a Weibull distribution algorithm was used to develop a real-time prediction model. Fivefold cross-validation testing (training set: 80% and validation set: 20%) determined the consistency of model accuracy. The time-dependent area under the curve (TAUC) was analyzed based on the aggregation of 5 validation sets.
Results The TAUCs of the implemented model were 0.963, 0.942, 0.917, 0.875, 0.850, 0.842, and 0.761 before cardiac arrest at 1, 8, 16, 24, 32, 40, and 48 hours, respectively. The sensitivity was between 0.846 and 0.909, and specificity was between 0.923 and 0.946. The distribution of risk between the cardiac arrest group and the non–cardiac arrest group was generally different, and the difference rapidly increased as the time left until cardiac arrest reduced.
Conclusions A deep learning model for forecasting cardiac arrest was implemented and tested by considering the cumulative and fluctuating effects of time-dependent clinical data gathered from a large medical center. This real-time prediction model is expected to improve patient care by allowing early intervention in patients at high risk of unexpected cardiac arrest.
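To make the idea of a risk distribution over time more concrete, the sketch below evaluates the Weibull cumulative distribution function, which gives the probability that an event occurs within a given horizon. It assumes the network emits a Weibull scale and shape parameter for each patient at each time step, which the abstract does not state explicitly; the function name and the example parameters are hypothetical.

```python
import math


def weibull_event_probability(t_hours: float, scale: float, shape: float) -> float:
    """Probability that the event occurs within t_hours under Weibull(scale, shape).

    Uses the Weibull CDF: F(t) = 1 - exp(-(t / scale) ** shape).
    """
    if t_hours < 0 or scale <= 0 or shape <= 0:
        raise ValueError("t_hours must be >= 0; scale and shape must be > 0")
    return 1.0 - math.exp(-((t_hours / scale) ** shape))


def risk_curve(horizons_hours, scale: float, shape: float):
    """Evaluate the event probability at several horizons (for example 1 to 48 hours)."""
    return [weibull_event_probability(t, scale, shape) for t in horizons_hours]
```

With hypothetical parameters `scale=24.0` and `shape=1.5`, calling `risk_curve([1, 8, 16, 24, 32, 40, 48], 24.0, 1.5)` yields a curve that rises monotonically with the horizon.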
Kwang Sun Ryu, Sang Won Lee, Erdenebileg Batbaatar, Jae Wook Lee, Kui Son Choi, Hyo Soung Cha
A screening model for undiagnosed diabetes mellitus (DM) is important for early medical care. Insufficient research has been carried out on developing a screening model for undiagnosed DM using machine learning techniques. Thus, the primary objective of this study was to develop a screening model for patients with undiagnosed DM using a deep neural network. We conducted a cross-sectional study using data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2013–2016. A total of 11,456 participants were selected, excluding those with diagnosed DM, an age < 20 years, or missing data. KNHANES 2013–2015 was used as a training dataset and analyzed to develop a deep learning model (DLM) for undiagnosed DM. The DLM was evaluated with 4444 participants who were surveyed in the 2016 KNHANES. The DLM was constructed using seven non-invasive variables (NIV): age, waist circumference, body mass index, gender, smoking status, hypertension, and family history of diabetes. The model showed an appropriate performance (area under the curve [AUC]: 80.11) compared with existing screening models. The DLM developed in this study for patients with undiagnosed diabetes could contribute to early medical care.
Young Ki Lee, Yul Hwangbo, Sangwon Lee, Dong-Eun Lee, Eun Kyung Lee, Min Sun Yeom, Jungnam Joo, Sun-Young Kong
Background While aspirin use is known to be associated with reduced incidence of various cancer types, it is unclear whether this benefit extends to thyroid cancer. We aimed to evaluate the association between aspirin use and thyroid cancer development.
Methods This nested case–control study used nationwide data from the Korean National Health Insurance Service-National Sample Cohort 2002–2015. In total, 4547 individuals with newly developed thyroid cancer were matched with 13,641 controls based on age, sex, and follow-up period. Odds ratios (ORs) and 95% confidence intervals (CIs) for thyroid cancer development according to aspirin use were analyzed using a multivariable conditional logistic regression model.
Results The number of days for which patients with thyroid cancer used aspirin (the proportions of no use, <30 days/year, 30–90 days/year, and ≥90 days/year were 93.03%, 6.51%, 0.31%, and 0.15%, respectively) was comparable with that of the controls (p = 0.371, chi-squared test). The risk of thyroid cancer development was not associated with the duration of aspirin use (ORs [CI] for aspirin use <30 days/year, 30–90 days/year, and ≥90 days/year were 1.11 [0.96–1.28], 1.01 [0.54–1.88], and 1.23 [0.50–3.06], respectively, compared with no use) after adjusting for body mass index, smoking status, hypertension, Charlson comorbidity index, and number of outpatient visits per year. In addition, subgroup analyses stratified by age, sex, and follow-up duration did not reveal any significant association between aspirin use and thyroid cancer.
Conclusions Our findings suggest that even extended aspirin use may not impact the prevention or onset of thyroid cancer.
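For readers less familiar with odds ratios, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2×2 table. It is a simplification: the study used a multivariable conditional logistic regression on matched data, which this does not reproduce. The function name and the example counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(exposed_cases: int, unexposed_cases: int,
                  exposed_controls: int, unexposed_controls: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 for 95%)."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that contains 1, as in the results reported above, indicates no statistically significant association at the chosen level.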
Yong-Yeon Jo, Young Sang Choi, Hyun Woo Park, Jae Hyeok Lee, Hyojung Jung, Hyo-Eun Kim, Kyounglan Ko, Chan Wha Lee, Hyo Soung Cha, Yul Hwangbo
Image compression is used in several clinical organizations to help address the overhead associated with medical imaging. These methods reduce file size by using a compact representation of the original image. This study aimed to analyze the impact of image compression on the performance of deep learning-based models in classifying mammograms as “malignant” (cases that lead to a cancer diagnosis and treatment) or “normal” and “benign” (non-malignant cases that do not require immediate medical intervention). In this retrospective study, 9111 unique mammograms (5672 normal, 1686 benign, and 1754 malignant cases) were collected from the National Cancer Center in the Republic of Korea. Image compression was applied to mammograms with compression ratios (CRs) ranging from 15 to 11 K. Convolutional neural networks (CNNs) with three convolutional layers and three fully-connected layers were trained using these images to classify a mammogram as malignant or not malignant across a range of CRs using five-fold cross-validation. Models trained on images with maximum CRs of 5 K had an average area under the receiver operating characteristic curve (AUROC) of 0.87 and area under the precision-recall curve (AUPRC) of 0.75 across the five folds and compression ratios. For images compressed with CRs of 10 K and 11 K, model performance decreased (average 0.79 in AUROC and 0.49 in AUPRC). Upon generating saliency maps that visualize the areas each model views as significant for prediction, models trained on less compressed (CR ≤ 5 K) images had maps encapsulating a radiologist’s label, while models trained on images with higher amounts of compression had maps that missed the ground truth completely. In addition, base ResNet18 models pre-trained on ImageNet and trained using compressed mammograms did not show performance improvements over our CNN model, with AUROC and AUPRC values ranging from 0.77 to 0.87 and 0.52 to 0.71, respectively, when trained and tested on images with maximum CRs of 5 K. This paper finds that while training models on compressed images increased the robustness of the models when tested on compressed data, moderate image compression did not substantially impact the classification performance of DL-based models.
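A compression ratio is the original size divided by the compressed size, so a CR of 11 K means the compressed file is roughly 11,000 times smaller. The sketch below shows that arithmetic and, purely as a stand-in, measures the ratio achieved by lossless `zlib` compression from the Python standard library. The abstract does not name the codec used in the study, so the choice of `zlib` and both function names are assumptions made for illustration.

```python
import zlib


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Compression ratio (CR) = original size / compressed size."""
    if original_size < 0 or compressed_size <= 0:
        raise ValueError("original_size must be >= 0 and compressed_size > 0")
    return original_size / compressed_size


def zlib_ratio(raw: bytes) -> float:
    """CR achieved by zlib at its highest level on the given bytes.

    zlib is only a stand-in here; it is lossless and is not claimed to be
    the method used for the mammograms.
    """
    compressed = zlib.compress(raw, 9)
    return compression_ratio(len(raw), len(compressed))
```

For instance, `compression_ratio(11_000_000, 1_000)` corresponds to a CR of 11 K.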
Junetae Kim, Sangwon Lee, Eugene Hwang, Kwang Sun Ryu, Hanseok Jeong, Jae Wook Lee, Yul Hwangbo, Kui Son Choi, Hyo Soung Cha
Background Despite excellent prediction performance, noninterpretability has undermined the value of applying deep-learning algorithms in clinical practice. To overcome this limitation, the attention mechanism has been introduced to clinical research as an explanatory modeling method. However, potential limitations of using this attractive method have not been clarified to clinical researchers. Furthermore, there has been a lack of introductory information explaining attention mechanisms to clinical researchers.
Objective The aim of this study was to introduce the basic concepts and design approaches of attention mechanisms. In addition, we aimed to empirically assess the potential limitations of current attention mechanisms in terms of prediction and interpretability performance.
Methods First, the basic concepts and several key considerations regarding attention mechanisms were identified. Second, four approaches to attention mechanisms were suggested according to a two-dimensional framework based on the degrees of freedom and uncertainty awareness. Third, the prediction performance, probability reliability, concentration of variable importance, consistency of attention results, and generalizability of attention results to conventional statistics were assessed in the diabetic classification modeling setting. Fourth, the potential limitations of attention mechanisms were considered.
Results Prediction performance was very high for all models. Probability reliability was high in models with uncertainty awareness. Variable importance was concentrated in several variables when uncertainty awareness was not considered. The consistency of attention results was high when uncertainty awareness was considered. The generalizability of attention results to conventional statistics was poor regardless of the modeling approach.
Conclusions The attention mechanism is an attractive technique with potential to be very promising in the future. However, it may not yet be desirable to rely on this method to assess variable importance in clinical settings. Therefore, along with theoretical studies enhancing attention mechanisms, more empirical studies investigating potential limitations should be encouraged.
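The basic concept behind attention can be sketched in a few lines: raw relevance scores for each input variable are turned into non-negative weights that sum to 1 (a softmax), and those weights are then read as variable importance. The example below is a generic illustration of that idea, not one of the four approaches evaluated in the study; the function names and inputs are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def attention_weights(scores: Sequence[float]) -> List[float]:
    """Softmax over raw scores: non-negative weights that sum to 1."""
    if not scores:
        raise ValueError("scores must not be empty")
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(values: Sequence[float], scores: Sequence[float]) -> Tuple[float, List[float]]:
    """Return the attention-weighted sum of values together with the weights."""
    if len(values) != len(scores):
        raise ValueError("values and scores must have the same length")
    weights = attention_weights(scores)
    context = sum(w * v for w, v in zip(weights, values))
    return context, weights
```

Because the weights are forced to sum to 1, a large weight on one variable necessarily shrinks the others, which is one reason the concentration of variable importance reported above deserves caution.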
Yong-Yeon Jo, JaiHong Han, Hyun Woo Park, Hyojung Jung, Jae Dong Lee, Jipmin Jung, Hyo Soung Cha, Dae Kyung Sohn, Yul Hwangbo
Background Postoperative length of stay is a key indicator in the management of medical resources and an indirect predictor of the incidence of surgical complications and the degree of recovery of the patient after cancer surgery. Recently, machine learning has been used to predict complex medical outcomes, such as prolonged length of hospital stay, using extensive medical information.
Objective The objective of this study was to develop a prediction model for prolonged length of stay after cancer surgery using a machine learning approach.
Methods In our retrospective study, electronic health records (EHRs) from 42,751 patients who underwent primary surgery for 17 types of cancer between January 1, 2000, and December 31, 2017, were sourced from a single cancer center. The EHRs included numerous variables such as surgical factors, cancer factors, underlying diseases, functional laboratory assessments, general assessments, medications, and social factors. To predict prolonged length of stay after cancer surgery, we employed extreme gradient boosting classifier, multilayer perceptron, and logistic regression models. Prolonged postoperative length of stay for cancer was defined as bed-days of the group of patients who accounted for the top 50% of the distribution of bed-days by cancer type.
Results In the prediction of prolonged length of stay after cancer surgery, extreme gradient boosting classifier models demonstrated excellent performance for kidney and bladder cancer surgeries (area under the receiver operating characteristic curve [AUC] >0.85). A moderate performance (AUC 0.70-0.85) was observed for stomach, breast, colon, thyroid, prostate, cervix uteri, corpus uteri, and oral cancers. For stomach, breast, colon, thyroid, and lung cancers, with more than 4000 cases each, the extreme gradient boosting classifier model showed slightly better performance than the logistic regression model, although the logistic regression model also performed adequately. We identified risk variables for the prediction of prolonged postoperative length of stay for each type of cancer, and the importance of the variables differed depending on the cancer type. After we added operative time to the models trained on preoperative factors, the models generally outperformed the corresponding models using only preoperative variables.
Conclusions A machine learning approach using EHRs may improve the prediction of prolonged length of hospital stay after primary cancer surgery. This algorithm may help to provide a more effective allocation of medical resources in cancer surgery.
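The outcome definition in the Methods (the top 50% of bed-days within each cancer type) can be sketched as a per-group median split. The code below assumes that "top 50%" means bed-days strictly above the median for that cancer type; how ties at the median were handled is not stated in the abstract. The function name and the example records are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import List, Sequence, Tuple


def label_prolonged_stay(records: Sequence[Tuple[str, float]]) -> List[bool]:
    """Label each (cancer_type, bed_days) record as prolonged or not.

    A stay is labeled prolonged when its bed-days exceed the median
    bed-days of its own cancer type (assumed reading of "top 50%").
    """
    bed_days_by_type = defaultdict(list)
    for cancer_type, bed_days in records:
        bed_days_by_type[cancer_type].append(bed_days)
    cutoffs = {t: median(days) for t, days in bed_days_by_type.items()}
    return [bed_days > cutoffs[cancer_type] for cancer_type, bed_days in records]
```

Computing the cutoff separately per cancer type keeps a long stay for one surgery from being compared against the typical stay for a different one.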
Hyun Woo Park, Hyojung Jung, Kyoung Yeon Back, Hyeon Ju Choi, Kwang Sun Ryu, Hyo Soung Cha, Eun Kyung Lee, A Ram Hong, Yul Hwangbo
Dual-energy X-ray absorptiometry (DXA) is the gold standard for diagnosing osteoporosis; it is generally recommended in men ≥ 70 and women ≥ 65 years old. Therefore, assessment of clinical risk factors for osteoporosis is very important in individuals under the recommended age for DXA. Here, we examine the diagnostic performance of machine learning-based prediction models for osteoporosis in individuals under the recommended age for DXA examination. Data of 2210 men aged 50–69 and 1099 women aged 50–64 obtained from the Korea National Health and Nutrition Examination Survey IV–V were analyzed. Extreme gradient boosting (XGBoost) was used to find relevant clinical features and applied to three machine learning models: XGBoost, logistic regression, and a multilayer perceptron. For the prediction of osteoporosis, the XGBoost model using the top 20 features extracted from XGBoost showed the most reliable performance with area under the receiver operating characteristic curve (AUROC) of 0.73 and 0.79 in men and women, respectively. We compared the diagnostic accuracy of the Shapley additive explanation values based on a risk-score model obtained from XGBoost and conventional osteoporosis risk assessment tools for prediction of osteoporosis using optimal cut-off values for each model. We observed that a cut-off risk score of ≥ 28 in men and ≥ 47 in women was optimal to classify a positive screening for osteoporosis (an AUROC of 0.86 in men and 0.91 in women). The XGBoost-based osteoporosis-prediction model outperformed conventional risk assessment tools. Therefore, machine learning-based prediction models are a more suitable option than conventional risk assessment methods for screening osteoporosis in individuals under the recommended age for DXA examination.
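The reported cut-offs translate into a simple screening rule, sketched below together with the sensitivity and specificity one would compute when checking such a rule against DXA results. The cut-off values (≥ 28 for men, ≥ 47 for women) come from the abstract; the function names, the dictionary keys, and any example data are hypothetical.

```python
from typing import Sequence, Tuple

# Cut-off risk scores reported in the abstract above.
RISK_SCORE_CUTOFFS = {"male": 28, "female": 47}


def screen_positive(sex: str, risk_score: float) -> bool:
    """Return True when the risk score meets or exceeds the sex-specific cut-off."""
    if sex not in RISK_SCORE_CUTOFFS:
        raise ValueError("sex must be 'male' or 'female'")
    return risk_score >= RISK_SCORE_CUTOFFS[sex]


def sensitivity_specificity(predicted: Sequence[bool],
                            actual: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of screening calls against reference labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    true_pos = sum(p and a for p, a in zip(predicted, actual))
    true_neg = sum((not p) and (not a) for p, a in zip(predicted, actual))
    positives = sum(actual)
    negatives = len(actual) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative reference label")
    return true_pos / positives, true_neg / negatives
```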
Jae Dong Lee, Hyo Soung Cha, Shailendra Rathore, Jong Hyuk Park
In recent years, the application of smart city technology in the healthcare sector via IoT systems has continued to grow exponentially, and various advanced network intrusions have emerged as these IoT devices are connected. Previous studies focused on security threat detection and blocking technologies that rely on testbed data obtained from a single medical IoT device or on simulation using a well-known dataset, such as the NSL-KDD dataset. However, such approaches do not reflect the features that exist in real medical scenarios, leading to failure in potential threat detection. To address this problem, we proposed a novel intrusion classification architecture known as a Multi-class Classification based Intrusion Detection Model (M-IDM), which relies on data collected by real devices and on convolutional neural networks, which exhibit better performance than conventional machine learning algorithms such as naïve Bayes and support vector machine (SVM). Unlike existing studies, the proposed architecture employs the actual healthcare IoT environment of the National Cancer Center in South Korea and actual network data from real medical devices, such as patient monitors (i.e., electrocardiogram monitors and thermometers). The proposed architecture classifies the data into multiple classes (critical, informal, major, and minor) for intrusion detection. Further, we experimentally evaluated the performance of the neural network-based model and compared it with that of conventional machine learning algorithms, including naïve Bayes, SVM, and logistic regression.
Jaedong Lee, Phillip Park, Sumi Ryu, Hyosoung Cha
When using personal information, researchers face difficulty complying with Korea’s Personal Information Protection Act. Therefore, a clinical common data model (CDM)-based research methodology for multi-institutional sharing of code and statistical analysis results has been used. However, the current multi-institutional CDM study environment lacks consideration of personal information protection and institutional arrangements. Therefore, we propose a two-factor secure framework for the clinical distributed multicenter study environment. In this framework, two factors (security status and security awareness) are considered. The security status is based on applying objective security factors and technologies to the clinically distributed multicenter study infrastructure and related systems. The security awareness is based on objective security factors for a user-oriented security application to a clinically distributed multicenter study environment. This two-factor assessment-based approach identifies objective factors for complex clinically distributed multicenter study environments, research procedures, and users, and applies security factors in detail. The proposed framework investigates the views of users of the CDM, which has been regarded as safe so far. It compares and analyzes the security status of the CDM and reflects it in the framework design to enhance security and support a smooth clinical data use environment.
Title | Pending ID | Pending ID Date | Issued ID | Issued Date | Where |
---|---|---|---|---|---|
Federated learning system and method between medical institutions, and disease prognosis prediction system including the same | PCT/KR2023/014218 | 2023-09-20 | | 2023-09-20 | PCT |
Federated learning system and method between medical institutions, and disease prognosis prediction system including the same | 10-2022-0140554 | 2022-10-27 | | 2022-10-27 | KR |
Method for predicting depth of anesthesia based on electroencephalogram signals, and depth-of-anesthesia prediction apparatus performing the method | 10-2021-0013881 | 2021-02-01 | 10-2474-0850000 | 2022-11-30 | KR |
이재동, 이예림, 유재용, 박현우, 황보율
This standard proposes a way to build unstructured nursing records into voice data, the clinical data for training speech recognition models optimized for medical environments. After collecting nursing records and anonymizing them to protect personal information, a script is generated to build them as clinical data. Scripts are generated by dividing them into scenario scripts to be trained together in the speech learner and pronunciation scripts so that all speakers can record them consistently. Voice data is generated by recording based on a pronunciation script. After examining the data at each stage, scenario scripts and voice data, which are clinical data, are finally established.
URL : https://www.tta.or.kr/tta/ttaSearchView.do?key=77&rep=1&&searchCate=TTAS&searchStandardNo=TTAK.KO-10.1360
이예림, 이재동, 박현우, 안선희, 백경연, 황보율
This standard proposes a construction plan for using digital pathological image data and clinical information metadata as AI data to develop a diagnostic-aid AI model. After pathological images generated in various formats are standardized, the image data and metadata are de-identified to protect personal information. After lesion annotation and tiling of the image data, the suitability of the training data is verified. The final products, image data and de-identified metadata, can be used as training data for artificial intelligence.
URL : http://www.tta.or.kr/data/ttas_view.jsp?rn=1&pk_num=TTAK.KO-10.1303
박현우, 황보율
The standard provides a structured standard form for mammography images and metadata collected from various hospitals. Mammography images collected from various hospitals follow the DICOM international standard, but because of the risk of personal identification, it is necessary to define a metadata structure and de-identify DICOM tags before applying artificial intelligence technology to breast cancer detection. The proposed standard method can structure and de-identify mammography images and metadata from multiple centers. Releasing the structured data makes it possible for industry and academia to develop disease detection models using artificial intelligence technology.
URL : http://www.tta.or.kr/data/ttas_view.jsp?rn=1&pk_num=TTAK.KO-10.1276