Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the Zip, spectral. npy was the average spectral data of red ginseng, mycotoxins and interference impurities, and label. npy was the corresponding label. Spectral data format was [1200,510] and label data format was [1200,1]. The example of data usage (sklearn in Python database was used to establish the classification model) was as follows:
import numpy as np
from sklearn. model_selection import train_test_split
from sklearn. preprocessing import StandardScaler
from sklearn. neighbors import KNeighborsClassifier
from sklearn. metrics import classification_report, accuracy_score
# Load spectral data and labels
x = np.load('.../spectral.npy')[:,1:-1]
y = np.load('.../label.npy')
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
# Data standardization
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
# Train the KNN model
knn_model = KNeighborsClassifier(n_neighbors=5)
knn_model. fit(x_train, y_train)
# Predict
y_pred = knn_model.predict(x_test)
# Print classification reports and accuracy rates
print("Classification Report:")
print(classification_report(y_test, y_pred))
print("Accuracy Score:")
print(accuracy_score(y_test, y_pred))
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset consists of particulate matter concentration and meteorology data, measured in Singapore, Chinatown, and Central business district from March 13, 2018, to March 16, 2018. The data collectors walked from the Outram district - Chinatown to the Central Business District in Singapore. The measurements were carried out using a hand-held air quality sensor ensemble (URBMOBI 3.0).
The dataset contains information from two URBMOBI 3.0 devices and one reference-grade device (Grimm 1.109). The data from the sensors and Grimm are denoted by the subscript, 's1', 's2', and 'gr', respectively.
singapore_all_pm_25.geojson : The observed PM concentration and meteorology, aggregated using a 25 m buffer around the measurement points.
Information on working with geojson file can be found under GeoJSON .
Units:
PM : µg/m³
Scaled_PM_MM : Dimensionless entity scaled using Min-Max-Scaler (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html)
Scaled_PM_SS : Dimensionless entity scaled using Standard-Scaler (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html)
Air temperature: °C
Relative humidity: %
The measurements are part of the "Effects of heavy precipitation events on near-surface climate and particulate matter concentrations in Singapore". It is funded by the support from Humboldt-Universität zu Berlin for seed funding for collaborative projects between National University of Singapore and Humboldt-Universität zu Berlin.
Not seeing a result you expected?
Learn how you can add new datasets to our index.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the Zip, spectral. npy was the average spectral data of red ginseng, mycotoxins and interference impurities, and label. npy was the corresponding label. Spectral data format was [1200,510] and label data format was [1200,1]. The example of data usage (sklearn in Python database was used to establish the classification model) was as follows:
import numpy as np
from sklearn. model_selection import train_test_split
from sklearn. preprocessing import StandardScaler
from sklearn. neighbors import KNeighborsClassifier
from sklearn. metrics import classification_report, accuracy_score
# Load spectral data and labels
x = np.load('.../spectral.npy')[:,1:-1]
y = np.load('.../label.npy')
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
# Data standardization
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
# Train the KNN model
knn_model = KNeighborsClassifier(n_neighbors=5)
knn_model. fit(x_train, y_train)
# Predict
y_pred = knn_model.predict(x_test)
# Print classification reports and accuracy rates
print("Classification Report:")
print(classification_report(y_test, y_pred))
print("Accuracy Score:")
print(accuracy_score(y_test, y_pred))