dc.contributor.author: Vahldiek, Kai
dc.contributor.author: Zhou, Libing
dc.contributor.author: Zhu, Wenfeng
dc.contributor.author: Klawonn, Frank
dc.date.accessioned: 2021-08-26T09:53:25Z
dc.date.available: 2021-08-26T09:53:25Z
dc.date.issued: 2021-01-01
dc.identifier.citation: 2021, Intelligent Data Analysis, 25(4) pp.789-807; DOI:10.3233/IDA-205253. [en_US]
dc.identifier.issn: 1088467X
dc.identifier.doi: 10.3233/IDA-205253
dc.identifier.uri: http://hdl.handle.net/10033/623003
dc.description.abstract: Artificial or simulated data are particularly relevant in tests and benchmarks for machine learning methods, in teaching for exercises and for setting up analysis workflows. They are relevant when real data may not be used for reasons of data protection, or when special distributions or effects should be present in the data to test certain machine learning methods. In this paper a generator for multivariate numerical data with arbitrary marginal distributions and – as far as possible – arbitrary correlations is presented. The data generator is implemented in the open source statistics software R. It can also be used for categorical variables, if data are generated separately for the corresponding characteristics of a categorical variable. Additionally, outliers can be integrated. The use of the data generator is demonstrated with a concrete example. [en_US]
dc.description.sponsorship: Bundesministerium für Wirtschaft und Energie [en_US]
dc.language.iso: en [en_US]
dc.publisher: IOS Press [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: correlations [en_US]
dc.subject: Data generator [en_US]
dc.subject: data sets [en_US]
dc.subject: distribution functions [en_US]
dc.subject: simulations [en_US]
dc.title: Development of a data generator for multivariate numerical data with arbitrary correlations and distributions [en_US]
dc.type: Article [en_US]
dc.identifier.eissn: 15714128
dc.contributor.department: HZI, Helmholtz-Zentrum für Infektionsforschung GmbH, Inhoffenstr. 7, 38124 Braunschweig, Germany. [en_US]
dc.identifier.journal: Intelligent Data Analysis [en_US]
dc.identifier.eid: 2-s2.0-85110719560
dc.identifier.scopusid: SCOPUS_ID:85110719560
dc.source.volume: 25
dc.source.issue: 4
dc.source.beginpage: 789
dc.source.endpage: 807
refterms.dateFOA: 2021-08-26T09:53:26Z
dc.source.journaltitle: Intelligent Data Analysis


Files in this item

Name: vahldiek_et_al.pdf
Size: 686.8Kb
Format: PDF
Description: accepted manuscript

Except where otherwise noted, this item's license is described as Attribution 4.0 International