Publish your training data on Radiant MLHub for NeurIPS 2021
Submissions to the new Datasets and Benchmarks track require data documentation and availability on an open repository.
Organizers of the NeurIPS 2021 conference recently announced a new track for Datasets and Benchmarks. It is a significant development for a major machine learning (ML) conference to highlight the importance of data in developing algorithms for real-world problems. We at Radiant Earth Foundation welcome this initiative and applaud the organizers for establishing the track.
In recent years, there has been much discussion about how to incentivize ML researchers to work on real-world problems. One such incentive is the opportunity to publish a paper at a peer-reviewed conference and gain recognition for working on these problems. The new track at NeurIPS is a necessary step toward realizing these incentives.
ML algorithms rely heavily on data, and publishing and standardizing that data are key to the reproducibility and adoption of the final products. Data plays an even larger role in real-world problems, where training data is usually not readily available and collecting new data can be expensive. In addition, real-world data distributions often have significant class imbalances, and the training data may not be representative of the whole population. In the case of geospatial datasets, lack of geographical diversity is a key problem.
One of the core goals of Radiant MLHub is to facilitate the publication of ML-ready geospatial training data and provide an easy way for users to find and access existing datasets. Currently, we host 23 high-quality datasets and have a growing community of more than 2,000 users.
Call to Action
We are excited to announce the opportunity to host datasets that researchers would like to submit to NeurIPS 2021. Papers submitted to the new track should have the data already published on an open repository.
We encourage the broader Earth Science community to use this opportunity to publish datasets that can help advance applications of ML in this domain.
If you are interested in hosting your dataset on Radiant MLHub, please submit your request here.