Third-grade English Language Arts (ELA) and Math test results for the 2016-2017 school year by census tract for the state of Michigan. Data Driven Detroit obtained these datasets from MI School Data for the State of the Detroit Child tool in July 2017. Test results were originally obtained at the school level and aggregated to census tract by Data Driven Detroit. Student data were suppressed when fewer than five students were tested per school.
https://data.4tu.nl/info/fileadmin/user_upload/Documenten/4TU.ResearchData_Restricted_Data_2022.pdf
This file contains raw data for cameras and wearables of the ConfLab dataset.
./cameras
contains the overhead video recordings for 9 cameras (cam2-10) in MP4 files.
These cameras cover the whole interaction floor, with camera 2 capturing the
bottom of the scene layout, and camera 10 capturing the top of the scene layout.
Note that cam5 ran out of battery before the other cameras and thus the recordings
are cut short. However, cam4 and cam6 overlap significantly with cam5, so any
missing information can be reconstructed from them.
Note that the annotations are made and provided in 2 minute segments.
The annotated portions of the video include the last 3min38sec of x2xxx.MP4
video files, and the first 12 min of x3xxx.MP4 files for cameras (2,4,6,8,10),
with "x" being the placeholder character in the mp4 file names. If one wishes
to separate the video into 2 min segments as we did, the "video-splitting.sh"
script is provided.
./camera-calibration contains the camera intrinsic files obtained from
https://github.com/idiap/multicamera-calibration. Camera extrinsic parameters can
be calculated using the existing intrinsic parameters and the instructions in the
multicamera-calibration repo. The coordinates in the image are provided by the
crosses marked on the floor, which are visible in the video recordings.
The crosses are 1m apart (=100cm).
./wearables
subdirectory includes the IMU, proximity and audio data from each
participant at the Conflab event (48 in total). In the directory numbered
by participant ID, the following data are included:
1. raw audio file
2. proximity (bluetooth) pings (RSSI) file (raw and csv) and a visualization
3. Tri-axial accelerometer data (raw and csv) and a visualization
4. Tri-axial gyroscope data (raw and csv) and a visualization
5. Tri-axial magnetometer data (raw and csv) and a visualization
6. Game rotation vector (raw and csv), recorded in quaternions.
All files are timestamped.
The sampling frequencies are:
- audio: 1250 Hz
- rest: around 50Hz. However, the sample rate is not fixed
and instead the timestamps should be used.
For rotation, the game rotation vector's output frequency is limited by the
actual sampling frequency of the magnetometer. For more information, please refer to
https://invensense.tdk.com/wp-content/uploads/2016/06/DS-000189-ICM-20948-v1.3.pdf
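Because the non-audio sensors are not sampled at a fixed rate, a uniform time base has to be derived from the timestamps. The following is a minimal Python sketch of one way to do this by linear interpolation. The function name, the in-memory list layout and the 50 Hz target rate are assumptions for illustration and are not part of the dataset tooling; timestamps are assumed to be in seconds and strictly increasing.

```python
from bisect import bisect_right


def resample_uniform(timestamps, values, rate_hz=50.0):
    """Linearly interpolate irregularly sampled values onto a uniform grid.

    timestamps: strictly increasing sample times in seconds (assumed unit).
    values: one reading per timestamp (e.g. one accelerometer axis).
    Returns (grid_times, grid_values).
    """
    if len(timestamps) != len(values) or len(timestamps) < 2:
        raise ValueError("need at least two (timestamp, value) pairs")
    step = 1.0 / rate_hz
    # Number of grid points that fit in the recorded span (epsilon guards
    # against floating-point round-off in the division).
    count = int((timestamps[-1] - timestamps[0]) / step + 1e-9) + 1
    grid_times, grid_values = [], []
    for k in range(count):
        t = timestamps[0] + k * step
        # Last sample at or before t, clamped so that i + 1 is a valid index.
        i = min(max(bisect_right(timestamps, t) - 1, 0), len(timestamps) - 2)
        t0, t1 = timestamps[i], timestamps[i + 1]
        w = (t - t0) / (t1 - t0)
        grid_times.append(t)
        grid_values.append(values[i] + w * (values[i + 1] - values[i]))
    return grid_times, grid_values
```

Linear interpolation is only one option; for quaternion data such as the game rotation vector, component-wise interpolation is an approximation and the result should be re-normalized.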
Audio files in this folder are in raw binary form. The following can be used to convert
them to WAV files (1250Hz):
ffmpeg -f s16le -ar 1250 -ac 1 -i /path/to/audio/file /path/to/output.wav
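Where ffmpeg is not available, the same conversion can be sketched with the Python standard library. This assumes, as the ffmpeg options above indicate, that the raw files hold headerless signed 16-bit little-endian mono samples at 1250 Hz. The function name and paths are illustrative only.

```python
import wave


def raw_pcm_to_wav(raw_path, wav_path, sample_rate=1250, channels=1):
    """Wrap headerless signed 16-bit little-endian PCM in a WAV container.

    Returns the number of audio frames written.
    """
    with open(raw_path, "rb") as src:
        pcm = src.read()
    # Drop a trailing partial frame, if any, so the frame count stays whole.
    frame_bytes = 2 * channels
    pcm = pcm[: len(pcm) - (len(pcm) % frame_bytes)]
    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(2)  # 2 bytes = 16 bits per sample
        dst.setframerate(sample_rate)
        dst.writeframes(pcm)  # WAV stores 16-bit samples little-endian
    return len(pcm) // frame_bytes
```

The whole file is read into memory, which is acceptable at 1250 Hz but should be replaced by chunked reads for much larger inputs.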
Synchronization of cameras and wearables data
Raw videos contain timecode information which matches the timestamps of the data in
the "wearables" folder. The starting timecode of a video can be read as:
ffprobe -hide_banner -show_streams -i /path/to/video
./audio
./sync: contains WAV files for each subject
./sync_files: auxiliary csv files used to sync the audio. Can be used to improve the synchronization.
The code used for syncing the audio can be found here:
https://github.com/TUDelft-SPC-Lab/conflab/tree/master/preprocessing/audio
Data have been processed by NODC to the NODC standard Bathythermograph (XBT) (C116) format. The C116/C118 format contains temperature-depth profile data obtained using expendable bathythermograph (XBT) instruments. Cruise information, position, date and time were reported for each observation. The data record comprised pairs of temperature-depth values. Unlike the MBT Data File, in which temperature values were recorded at uniform 5 m intervals, the XBT data files contained temperature values at non-uniform depths. These depths were recorded at the minimum number of points ("inflection points") required to accurately define the temperature curve. Standard XBTs can obtain profiles to depths of either 450 or 760 m. With special instruments, measurements can be obtained to 1830 m. Prior to July 1994, XBT data were routinely processed to one of these standard types. XBT data are now processed and loaded directly into the NODC Ocean Profile Data Base (OPDB). Historic data from these two data types were loaded into the OPDB.
Dataset of all the data supplied by each local authority and imputed figures used for national estimates.
This file is no longer being updated to include any late revisions local authorities may have reported to the department. Please use instead the Local authority housing statistics open data file for the latest data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Panama PA: Time Required to Obtain an Operating License was reported at 66.300 days in 2010, an increase from the previous figure of 41.200 days in 2006. The series is updated yearly and has 2 observations from Dec 2006 to 2010, with a median of 53.750 days. It reached an all-time high of 66.300 days in 2010 and a record low of 41.200 days in 2006. The series remains active in CEIC and is reported by the World Bank. It is categorized under Global Database’s Panama – Table PA.World Bank: Company Statistics. Time required to obtain an operating license is the average wait to obtain an operating license from the day the establishment applied for it to the day it was granted. Source: World Bank, Enterprise Surveys (http://www.enterprisesurveys.org/). Aggregation method: unweighted average.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data has been processed by NODC to the NODC standard Bathythermograph (MBT) (C128) format. The C128 format is used for temperature-depth profile data obtained using the mechanical bathythermograph (MBT) instrument. The maximum depth of MBT observations is approximately 285 m. Therefore, MBT data are useful only in studying the thermal structure of the upper layers of the ocean. Cruise information, date, position, and time are reported for each observation. The data record comprises pairs of temperature-depth values. Temperature data in this file are recorded at uniform 5 m depth intervals.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control; 6.25, 12.5, 25 and 50 µM for 6-OHDA; and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced and contains no missing values, and the data were standardized across features. The small number of samples prevented a full and robust statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.
Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments was performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.
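To make the "gain ratio" splitting criterion concrete, the following Python sketch computes it for a single binary threshold split on one numeric feature: the information gain of the split divided by the entropy of the split proportions. It illustrates the textbook definition only. The function names and the toy inputs are assumptions, and Orange's internal implementation may differ in its details.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting a numeric feature at `threshold`.

    Samples with value <= threshold go left, the others go right.
    Returns 0.0 when the split leaves one side empty.
    """
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    weights = (len(left) / n, len(right) / n)
    children = weights[0] * entropy(left) + weights[1] * entropy(right)
    info_gain = entropy(labels) - children
    # Split information penalizes very unbalanced partitions.
    split_info = -sum(w * log2(w) for w in weights)
    return info_gain / split_info
```

For example, gain_ratio([0.1, 0.2, 0.8, 0.9], ["a", "a", "b", "b"], 0.5) separates the two classes perfectly into equal halves, so both the information gain and the split information equal 1 bit.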
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data have been processed by NODC to the NODC standard Bathythermograph (XBT Aircraft) (C118) format. The C116/C118 format contains temperature-depth profile data obtained using expendable bathythermograph (XBT) instruments. Cruise information, position, date and time were reported for each observation. The data record comprised pairs of temperature-depth values. Unlike the MBT Data File, in which temperature values were recorded at uniform 5 m intervals, the XBT data files contained temperature values at non-uniform depths. These depths were recorded at the minimum number of points ("inflection points") required to accurately define the temperature curve. Standard XBTs can obtain profiles to depths of either 450 or 760 m. With special instruments, measurements can be obtained to 1830 m. Prior to July 1994, XBT data were routinely processed to one of these standard types. XBT data are now processed and loaded directly into the NODC Ocean Profile Data Base (OPDB). Historic data from these two data types were loaded into the OPDB.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sea ice surface roughness data were obtained during the PoleAirship campaign in April 2007 with a Single Beam Laser Altimeter (SBLA) mounted inside an electromagnetic system (EM-Bird) towed 10-30 m above the surface by a helicopter (Mil MI-8). A method developed by Hibler (1972) was used to a) isolate the surface profile from low-frequency variations associated with the aircraft motion and b) identify pressure ridge sails. The processing steps are described in https://epic.awi.de/id/eprint/56364/. We applied a ridge detection threshold of 0.6 m, which means that only sails higher than 0.6 m are detected. Version and name of the processing routine: Laser_Altimeter_Processing_VS5_06_20.py (vers.5, Feb 22, 2024, https://gitlab.awi.de/sitem/sbla_processing.git). SBLA records (RIEGL - LD90) are provided at a sampling rate of 100 Hz. Sensor accuracy is 5 cm with a beam diameter at surface of 5.8 cm. This specific dataset was obtained on 20070416T1852. It includes recorded altimeter readings, the derived surface elevation and width/height/spacing of detected pressure ridge sails. Note on data quality: 5.8 cm. File name: [DMS/PANGAEA Campaign Identifier] + [DEVICE] + [DATE/TIME] + [LAT/LON] + [Detection Threshold] + [Object] + [Version] + [Format]
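The effect of the 0.6 m detection threshold can be illustrated with a deliberately simplified Python sketch. It takes a surface elevation profile that has already been separated from the aircraft motion, marks local maxima higher than the threshold as sails, and derives their spacing. This is not the published processing routine: the function names and the input layout are assumptions, and further criteria of the Hibler (1972) method for separating neighbouring sails are not reproduced here.

```python
def detect_sails(distance_m, elevation_m, threshold_m=0.6):
    """Return (position, height) of local maxima higher than the threshold.

    distance_m: along-track position of each sample in metres.
    elevation_m: surface elevation relative to the level-ice surface in metres.
    """
    sails = []
    for i in range(1, len(elevation_m) - 1):
        height = elevation_m[i]
        # A peak rises above its left neighbour and does not fall below
        # its right neighbour (plateaus count once, at their left edge).
        is_peak = elevation_m[i - 1] < height >= elevation_m[i + 1]
        if is_peak and height > threshold_m:
            sails.append((distance_m[i], height))
    return sails


def sail_spacings(sails):
    """Along-track distance between consecutive detected sails."""
    return [b[0] - a[0] for a, b in zip(sails, sails[1:])]
```

With elevations [0.0, 0.7, 0.1, 0.5, 0.1, 1.2, 0.0] sampled every metre, the peaks of 0.7 m and 1.2 m are reported, while the 0.5 m peak is ignored because it does not exceed the threshold.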
Temperature profile data were collected using XBT and BT casts from NOAA Ship RESEARCHER in the Coastal Waters of Florida from 07 April 1982 to 12 April 1982. Data were collected by the Atlantic Oceanographic and Meteorological Laboratory (AOML) in Miami, Florida. Data were processed by NODC to the NODC standard Universal Bathythermograph Output (UBT) format. Full format description is available from NODC at www.nodc.noaa.gov/General/NODC-Archive/bt.html.
The UBT file format is used for temperature-depth profile data obtained using expendable bathythermograph (XBT) instruments. Standard XBTs can obtain profiles at depths of about 450 or 760 m. With special instruments, measurements can be obtained to 1830 m. Cruise information, position, date, and time are reported for each observation. The data record comprises pairs of temperature-depth values. Unlike the MBT data file, in which temperature values are recorded at uniform 5m intervals, the XBT Data File contains temperature values at non-uniform depths. These depths are at a minimum number of points ("inflection points") required to record the temperature curve to an acceptable degree of accuracy. On output, however, the user may request temperature values either at inflection points or interpolated to uniform depth increments.
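The relationship between inflection-point profiles and profiles at uniform depth increments can be sketched in Python as follows. The function takes temperature-depth pairs at non-uniform depths and linearly interpolates them onto a regular depth grid. The function name, the 5 m default step and the sample values are assumptions for illustration; this is not NODC's output routine.

```python
def to_uniform_depths(depths_m, temps_c, step_m=5.0):
    """Interpolate inflection-point pairs onto uniform depth levels.

    depths_m: strictly increasing depths in metres.
    temps_c: temperature at each depth.
    Returns (levels, temperatures), starting at the first recorded depth.
    """
    if len(depths_m) != len(temps_c) or len(depths_m) < 2:
        raise ValueError("need at least two temperature-depth pairs")
    levels, temps = [], []
    seg = 0  # index of the segment [depths_m[seg], depths_m[seg + 1]]
    k = 0
    while depths_m[0] + k * step_m <= depths_m[-1]:
        depth = depths_m[0] + k * step_m
        # Advance to the segment that contains the current level.
        while depths_m[seg + 1] < depth:
            seg += 1
        d0, d1 = depths_m[seg], depths_m[seg + 1]
        w = (depth - d0) / (d1 - d0)
        levels.append(depth)
        temps.append(temps_c[seg] + w * (temps_c[seg + 1] - temps_c[seg]))
        k += 1
    return levels, temps
```

Because the inflection points are chosen so that straight segments between them approximate the temperature curve, linear interpolation is the natural choice in this sketch.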
Rathayibacter toxicus is a gram-positive bacterium that is the causative agent of annual ryegrass toxicity, a disease that causes devastating losses in the Australian livestock industry. This bacterium is poorly characterized, making it difficult to accurately detect in feed samples. Using 1-D gels and mass spectrometry, we analyzed the protein expression of R. toxicus under stationary growth phase conditions to obtain a more complete understanding of the mechanisms of this organism. A total of 333 unique proteins were identified. The data obtained in this analysis is an essential first step toward developing an antibody-based diagnostic assay.
https://www.icpsr.umich.edu/web/ICPSR/studies/7079/terms
This study presents data obtained from one-fifth of a national sample of undergraduate students surveyed under the sponsorship of the Carnegie Commission on Higher Education (see CARNEGIE COMMISSION NATIONAL SURVEY OF HIGHER EDUCATION: UNDERGRADUATE STUDY, 1969-1970 [ICPSR 7503]). The original data were collected by the Survey Research Center, University of California at Berkeley, while the subsample was provided by the Social Science Data Center at the University of Connecticut. The subsample for the present study was randomly drawn and the 14,139 respondents were weighted to 1,312,178. Undergraduates were asked to provide information regarding their social and educational backgrounds, as well as their degree and career plans. Variables also elicited students' opinions on their institutions and departments, on educational policy in general, and on broad social and political issues. Demographic data cover age, sex, race, religion, marital status, birthplace, family income, and parents' levels of education.
XBT data were collected from the USCGC ACUSHNET in support of the Integrated Global Ocean Services System (IGOSS). Data were collected by the US Coast Guard from 07 January 1975 to 09 January 1975. Data were processed by NODC to the NODC standard Universal Bathythermograph Output (UBT) format. Full format description is available from NODC at www.nodc.noaa.gov/General/NODC-Archive/bt.html.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Socioeconomic dataset for analysing demand prediction of weekend markets in the city of Hamburg, Germany
In this DDLitLab-funded Data Literacy student project, our goal was to predict demand for weekend markets in the city of Hamburg using open-source data and OpenStreetMap in conjunction with machine learning algorithms. You can find a brief article about the initial grant and our approach here: https://www.cliccs.uni-hamburg.de/about-cliccs/news/2023-news/2023-08-24-ddlitlab-event.html
This repository is intended to make our code and visualisations openly available to University of Hamburg students for further research. It is not to be used without citation under any circumstances, and the University/authors reserve the right to withdraw consent at any time.
Please do not forget to cite our work in the event of fair use.
Organisation of our GitHub repository
Codes: contains the code for the different methods deployed for data preparation, variable selection, visualisations showing the spatial characteristics of our variables, calculating indices such as correlation coefficients, and machine learning methods in increasing order of complexity. The city-district (Stadtteil) is the unit of analysis.
Data (uploaded datasets): The open-source data for the project were obtained from OpenStreetMap (https://wiki.openstreetmap.org/wiki/Use_OpenStreetMap) and Statistik Nord (https://www.statistik-nord.de/). Each variable contains values for all Stadtteile (city-districts) of Hamburg. The filenames are self-explanatory.
The Hamburg shapefile was obtained from Geofabrik (https://www.geofabrik.de/de/data/shapefiles.html). In addition to the original data uploaded in this section, we have also included the final data we used with the algorithms, in the file final_data.csv
Our repository contains the following additional sections:
Results: This section contains results from the codes processed in the first section. It includes the final 10 variables selected for the study, the results from the VIF analysis, correlation matrix, and some model output statistics.
Visualisations: This section is dedicated to visualisations of the variables used for the study and the results from deployment of various methods. In case of any questions, please do not hesitate to contact us at our official student email addresses: first.lastname@studium.uni-hamburg.de. We are also available on LinkedIn for professional networking in case of other queries.
Data curators /DDLitLab data literacy project team
Ferdinand Hölzl
Leidy Gicela Vergara Lopez
Shivanshi Asthana
Shuyue Qu
Sojung Oh
Juan Miguel Rodriguez Lopez
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Lithuania LT: Time Required to Obtain an Operating License was reported at 42.700 days in 2013, a decrease from the previous figure of 65.400 days in 2009. The series is updated yearly and has 3 observations from Dec 2004 to 2013, with a median of 55.500 days. It reached an all-time high of 65.400 days in 2009 and a record low of 42.700 days in 2013. The series remains active in CEIC and is reported by the World Bank. It is categorized under Global Database’s Lithuania – Table LT.World Bank: Company Statistics. Time required to obtain an operating license is the average wait to obtain an operating license from the day the establishment applied for it to the day it was granted. Source: World Bank, Enterprise Surveys (http://www.enterprisesurveys.org/). Aggregation method: unweighted average.
http://rightsstatements.org/vocab/InC/1.0/
The dataset and source code for paper "Automating Intention Mining".
The code is based on dennybritz's implementation of Yoon Kim's paper Convolutional Neural Networks for Sentence Classification.
By default, the code uses TensorFlow 0.12. Errors might be reported with other versions of TensorFlow because some APIs are incompatible.
By running 'online_prediction.py', you can input any sentence and check the classification result produced by a pre-trained CNN model. The model uses all sentences of the four GitHub projects as training data.
By running 'play.py', you can get the evaluation result of cross-project prediction. Please check the code for more details of the configuration. By default, it uses the four GitHub projects as training data to predict the sentences in the DECA dataset. In this setting, the categories 'aspect evaluation' and 'others' are dropped since the DECA dataset does not contain them.
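The category filtering used in the cross-project setting can be sketched in a few lines of Python. The function name and the (sentence, label) tuple layout are assumptions for illustration, and the label strings other than 'aspect evaluation' and 'others' are hypothetical; the actual logic lives in 'play.py' and may be organized differently.

```python
def restrict_to_shared_labels(train_examples, target_labels):
    """Keep only training examples whose label also occurs in the target set.

    train_examples: list of (sentence, label) pairs.
    target_labels: iterable of labels present in the target dataset.
    """
    shared = set(target_labels)
    return [(text, label) for text, label in train_examples if label in shared]


# Hypothetical toy data: two categories are absent from the target labels.
train = [
    ("Could you add a dark mode?", "feature request"),
    ("I like the new layout.", "aspect evaluation"),
    ("Thanks!", "others"),
]
kept = restrict_to_shared_labels(train, ["feature request", "problem discovery"])
```

Here only the 'feature request' example is kept, mirroring how the two categories missing from the target dataset are dropped before training.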