77 datasets found
  1. Sign Line Task & Work Order Data

    • s.cnmilf.com
    • data.lacity.org
    • +1 more
    Updated Nov 29, 2021
    Cite
    data.lacity.org (2021). Sign Line Task & Work Order Data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/sign-line-task-work-order-data
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    data.lacity.org
    Description

    Sign (Task & Work Order) Information from eWork.

  2. sort

    • data.cityofchicago.org
    application/rdfxml +5
    Updated Mar 27, 2025
    + more versions
    Cite
    Chicago Police Department (2025). sort [Dataset]. https://data.cityofchicago.org/Public-Safety/sort/bnsx-zzcw
    Explore at:
    Available download formats: xml, tsv, csv, json, application/rdfxml, application/rssxml
    Dataset updated
    Mar 27, 2025
    Authors
    Chicago Police Department
    Description

    This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org.

    Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

    Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
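    Because the full extract cannot be opened in Excel, pulling it programmatically is often easier. Below is a minimal Python sketch, assuming the standard Socrata CSV endpoint for the dataset ID in the citation above (bnsx-zzcw); the field name primary_type is the usual Socrata name on this portal but should be verified before use.

    ```python
    import pandas as pd

    # Socrata serves datasets as paged CSV endpoints; $limit/$offset control
    # paging, so the full extract can be streamed in chunks instead of being
    # opened as one huge file. The dataset ID comes from the citation URL above.
    BASE = "https://data.cityofchicago.org/resource/bnsx-zzcw.csv"

    chunks, offset = [], 0
    while True:
        page = pd.read_csv(f"{BASE}?$limit=50000&$offset={offset}")
        if page.empty:
            break
        chunks.append(page)
        offset += len(page)  # for strictly stable paging, also add a $order clause

    crimes = pd.concat(chunks, ignore_index=True)
    print(crimes["primary_type"].value_counts().head())  # verify field name
    ```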

  3. Data from: A generalized R-matrix propagation program for solving coupled...

    • data.mendeley.com
    Updated Jan 1, 1984
    + more versions
    Cite
    Lesley A. Morgan (1984). A generalized R-matrix propagation program for solving coupled second-order differential equations [Dataset]. http://doi.org/10.17632/txszxrvgbk.1
    Dataset updated
    Jan 1, 1984
    Authors
    Lesley A. Morgan
    License

    https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/

    Description

    Title of program: RPROP2. Catalogue Id: AAJK_v2_0 [AAJL]

    Nature of problem Coupled second-order differential equations which arise in electron collision with atoms, ions and molecules are solved over a given range of the independent variable. The R-matrix at one end of the range is calculated given the R-matrix at the other end of the range.
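    For orientation, the sector-propagation step that programs of this family implement can be written in its generic textbook form; this is not an excerpt from RPROP2, and sign conventions vary between formulations.

    ```latex
    % Given the global R-matrix R(a) at one edge of a sector [a, b] and the
    % sector matrices r_{ij} built from local solutions, the R-matrix at the
    % other edge follows as:
    R(b) \,=\, r_{bb} - r_{ba}\,\bigl(r_{aa} + R(a)\bigr)^{-1} r_{ab}
    ```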

    Versions of this program held in the CPC repository in Mendeley Data:
    AAJK_v1_0; RPROP; 10.1016/0010-4655(82)90177-1
    AAJK_v2_0; RPROP2; 10.1016/0010-4655(84)90025-0

    This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2018)

  4. Replication Data for: Subject Placement in the History of Latin

    • dataverse.azure.uit.no
    • dataverse.no
    txt
    Updated Sep 28, 2023
    Cite
    Lieven Danckaert (2023). Replication Data for: Subject Placement in the History of Latin [Dataset]. http://doi.org/10.18710/V9D674
    Explore at:
    Available download formats: txt(531057), txt(3449), txt(6149)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Lieven Danckaert
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The present dataset was used in a corpus study on the diachrony of subject placement in the history of Latin, to appear in 'Catalan Journal of Linguistics'. The main file contains a set of Latin examples, which have all been annotated for a number of variables needed for the purpose of the study. A detailed description of the contents of this dataset is given in the README file. Finally, there is a file with the R code used to produce all the quantitative data mentioned in the paper. The abstract of the article follows below.

    Abstract: The aim of this paper is to provide further support for one aspect of the analysis of Classical and Late Latin clause structure proposed in Danckaert (2017a), namely the diachrony of subject placement. According to the relevant proposal, one needs to distinguish an earlier grammar (‘Grammar A’, whose heyday is the period from ca. 200 BC until 200 AD), in which there is no A-movement for subjects, and a later grammar (‘Grammar B’, which is on the rise from ca. 50-100 AD, and fully productive from ca. 200 AD onwards), where subjects optionally move to the inflectional layer. Assuming the variationist acquisition model of language change developed in Yang (2000, 2002a,b), I present corpus evidence which confirms that it is only in the Late Latin period that TP-internal subjects fully establish themselves as a grammatical option.

  5. stock_TESLA

    • kaggle.com
    Updated Dec 13, 2023
    + more versions
    Cite
    willian oliveira gibin (2023). stock_TESLA [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/stock-tesla
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 13, 2023
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    willian oliveira gibin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The “Tesla Stock Price Data (Last One Year)” dataset is a comprehensive collection of historical stock market information, focusing on Tesla Inc. (TSLA) for the past year. This dataset serves as a valuable resource for financial analysts, investors, researchers, and data enthusiasts who are interested in studying the trends, patterns, and performance of Tesla’s stock in the financial markets. It consists of 9 columns covering date, high and low prices, open and closing values, volume, cumulative open interest and, of course, change of price.

    As a first step, in order to better understand the data, we should plot the time series of each attribute. The cumulative Open Interest (OI) is the total number of open contracts being held in a particular Future, Call or Put contract on the Exchange. We can see that the biggest drop of the stock happened in January of 2023, and after 5 to 6 months it regained its stock value around the summer of the same year, with opening and closing prices around 300.

    As a next step, we plot some more charts in order to better understand the relation between our target column (change of price) and every other attribute. To interpret the results:

    Linear Regression:

    • Mean Absolute Error (MAE): 6.28. This model, on average, predicts the “Price Change” within approximately 6.28 units of the true value.
    • Mean Squared Error (MSE): 52.97. MSE measures the average of squared differences, and this value suggests some variability in prediction errors.
    • Root Mean Squared Error (RMSE): 7.28. RMSE is the square root of MSE and is in the same units as the target variable. An RMSE of 7.28 indicates the typical prediction error.
    • R-squared (R2): 0.0868. R-squared represents the proportion of the variance in the target variable explained by the model. An R2 of 0.0868 suggests that the model explains only a small portion of the variance, indicating limited predictive power.

    Decision Tree Regression:

    • Mean Absolute Error (MAE): 9.21. This model, on average, predicts the “Price Change” within approximately 9.21 units of the true value, which is higher than the Linear Regression model.
    • Mean Squared Error (MSE): 150.69. The MSE is relatively high, indicating larger prediction errors and more variability.
    • Root Mean Squared Error (RMSE): 12.28. An RMSE of 12.28 is notably higher, suggesting that this model has larger prediction errors.
    • R-squared (R2): -1.598. The negative R-squared value indicates that the model performs worse than a horizontal line as a predictor, indicating a poor fit.

    Random Forest Regression:

    • Mean Absolute Error (MAE): 6.99. This model, on average, predicts the “Price Change” within approximately 6.99 units of the true value, similar to Linear Regression.
    • Mean Squared Error (MSE): 62.79. MSE is lower than the Decision Tree model but higher than Linear Regression, suggesting intermediate prediction accuracy.
    • Root Mean Squared Error (RMSE): 7.92. RMSE is also intermediate, indicating moderate prediction errors.
    • R-squared (R2): -0.0824. The negative R-squared suggests that the Random Forest model does not perform well and has limited predictive power.
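    For reference, all four quoted metrics are simple to reproduce. A minimal sketch, assuming scikit-learn and arrays of true and predicted price changes; this makes no claim about how the original notebook computed them.

    ```python
    import numpy as np
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    def report(y_true, y_pred, name):
        """Print the four metrics quoted above for one model."""
        mae = mean_absolute_error(y_true, y_pred)
        mse = mean_squared_error(y_true, y_pred)
        rmse = np.sqrt(mse)            # same units as the target variable
        r2 = r2_score(y_true, y_pred)  # negative = worse than predicting the mean
        print(f"{name}: MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}  R2={r2:.4f}")

    report(np.array([1.0, 2.0, 3.0]), np.array([1.5, 1.8, 2.4]), "example")
    ```

    A negative R2, as reported for the Decision Tree and Random Forest models above, means the model predicts worse than a constant fixed at the mean of the target.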

  6. Film Circulation dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Samoilova, Evgenia (Zhenya) (2024). Film Circulation dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7887671
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Loist, Skadi
    Samoilova, Evgenia (Zhenya)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

    A peer-reviewed data paper for this dataset is under review for publication in NECSUS_European Journal of Media Studies, an open access journal aiming at enhancing data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org

    Please cite this when using the dataset.

    Detailed description of the dataset:

    1 Film Dataset: Festival Programs

    The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

    The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

    The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
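    The long-to-wide reduction described above (one row per unique film, keeping the first sample festival) can be expressed in a few lines. A sketch with hypothetical column names; the real names are defined in the codebook.

    ```python
    import pandas as pd

    long_df = pd.read_csv("1_film-dataset_festival-program_long.csv")

    # Hypothetical column names: 'film_id' (unique film ID) and 'edition_year'
    # (year of the festival edition). Sort so the earliest appearance comes
    # first, then keep one row per film, mirroring how the wide file resolves
    # the roughly six percent of films sampled at more than one festival.
    wide_df = (long_df.sort_values(["film_id", "edition_year"])
                      .drop_duplicates(subset="film_id", keep="first"))
    ```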

    2 Survey Dataset

    The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

    The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

    The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.

    3 IMDb & Scripts

    The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

    The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

    The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

    The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

    The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

    The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to one crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

    The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

    The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

    The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, TV, DVD/Blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

    The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

    The dataset includes 8 text files containing the scripts for web scraping. They were written using R version 3.6.3 for Windows.

    The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

    The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then potentially using an alternative title and a basic search if no matches are found in the advanced search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records from the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with suggested films on the IMDb search page. Matching was done using data on directors, production year (+/- one year), and title, with a fuzzy matching approach using two methods, “cosine” and “osa”: the cosine similarity is used to match titles with a high degree of similarity, and the OSA algorithm is used to match titles that may have typos or minor variations.
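    The two string measures named above are described but not shown. As a self-contained illustration (in Python, not the R packages the scripts actually use), cosine similarity over character n-grams and the OSA distance can be implemented as follows.

    ```python
    from collections import Counter
    from math import sqrt

    def osa(a: str, b: str) -> int:
        """Optimal string alignment distance: edits plus adjacent transpositions."""
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i
        for j in range(len(b) + 1):
            d[0][j] = j
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
                if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                    d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
        return d[len(a)][len(b)]

    def cosine_sim(a: str, b: str, n: int = 2) -> float:
        """Cosine similarity over character n-gram counts (case-insensitive)."""
        grams = lambda s: Counter(s[i:i + n] for i in range(len(s) - n + 1))
        ga, gb = grams(a.lower()), grams(b.lower())
        dot = sum(ga[g] * gb[g] for g in ga)
        norm = sqrt(sum(v * v for v in ga.values())) * sqrt(sum(v * v for v in gb.values()))
        return dot / norm if norm else 0.0

    print(osa("berlinale", "berlniale"))            # 1: one adjacent transposition
    print(cosine_sim("The Matrix", "Matrix, The"))  # fairly high: shared bigrams, reordered
    ```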

    The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into the following five categories: a) 100% match (perfect match on title, year, and director); b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible doubles in the dataset and identifies them for a manual check.

    The script “r_4_scraping_functions” creates functions for scraping the data from the identified matches (based on the scripts described above and the manual check). These functions are used for scraping the data in the next script.

    The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” in order to scrape the IMDb data for the identified matches. This script does that for the first 100 films only, to check that everything works. Scraping the entire dataset took a few hours, so a test with a subsample of 100 films is advisable.

    The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

    The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.

    The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the number of missing values and errors.

    4 Festival Library Dataset

    The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

    The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories, units of measurement, data sources and coding and missing data.

    The csv file “4_festival-library_dataset_imdb-and-survey” contains data on all unique festivals collected from both IMDb and survey sources. This dataset appears in wide format, i.e. all information for each festival is listed in one row. This

  7. Integration of Slurry Separation Technology & Refrigeration Units: Air...

    • catalog.data.gov
    • datasets.ai
    Updated Jun 25, 2024
    Cite
    data.usaid.gov (2024). Integration of Slurry Separation Technology & Refrigeration Units: Air Quality - H2S [Dataset]. https://catalog.data.gov/dataset/integration-of-slurry-separation-technology-refrigeration-units-air-quality-h2s-4af17
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    United States Agency for International Development: https://usaid.gov/
    Description

    This is the raw H2S data: concentration of H2S in parts per million in the biogas. Each sheet (tab) is formatted to be exported as a .csv for use with the R code (AQ-June20.R). In order for this code to work properly, it is important that this file remain intact. Do not change the column names or codes for data, for example, and to be safe, don’t even sort. One simple change in the Excel file could make the code full of bugs.
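    If the export step itself is scripted, every sheet can be written out without touching the workbook. A minimal Python sketch; the workbook file name is a placeholder.

    ```python
    import pandas as pd

    # sheet_name=None returns a dict mapping sheet names to DataFrames; write
    # each sheet to its own CSV and leave the workbook itself unmodified.
    sheets = pd.read_excel("h2s_raw_data.xlsx", sheet_name=None)  # placeholder name
    for name, df in sheets.items():
        df.to_csv(f"{name}.csv", index=False)
    ```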

  8. HUN AWRA-R calibration nodes v01

    • cloud.csiss.gmu.edu
    • researchdata.edu.au
    • +2 more
    zip
    Updated Dec 14, 2019
    Cite
    Australia (2019). HUN AWRA-R calibration nodes v01 [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/f2da394a-3d08-4cf4-8c24-bf7751ea06a1
    Explore at:
    Available download formats: zip(11340)
    Dataset updated
    Dec 14, 2019
    Dataset provided by
    Australia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset is a shapefile which is a subset for the Hunter subregion containing geographical locations and other characteristics (see below) of streamflow gauging stations.

    There are 3 files that have been extracted from the Hydstra database to aid in identifying sites in the Hunter subregion and the type of data collected from each one.

    The 3 files are:

    Site - lists all sites available in Hydstra from data providers. The data provider is indicated by a suffix on the #Station value, as _xxx. For example, sites in NSW are _77 and QLD are _66 (a parsing sketch follows this list).

    Some sites do not have locational information and will not be able to be plotted.

    Period - the period table lists all the variables that are recorded at each site and the period of record.

    Variable - the variable table shows variable codes and names which can be linked to the period table.
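    A minimal Python sketch of reading the provider suffix off a #Station value, using only the two codes mentioned above; the station number itself is made up.

    ```python
    # Provider codes taken from the description above; extend as needed.
    PROVIDERS = {"77": "NSW", "66": "QLD"}

    def provider(station: str) -> str:
        """Map a '#Station' value like '210001_77' to its data provider."""
        code = station.rsplit("_", 1)[-1]
        return PROVIDERS.get(code, "unknown")

    print(provider("210001_77"))  # -> NSW (illustrative station number)
    ```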

    Purpose

    Locations are used as pour points in order to define reach areas for river system modelling.

    Dataset History

    Subset of data for the Hunter subregion that was extracted from the Bureau of Meteorology's Hydstra system; it includes all gauges where data has been received from the lead water agency of each jurisdiction. The gauges shapefile for all bioregions was intersected with the Hunter subregion boundary to identify and extract gauges within the subregion.

    Dataset Citation

    Bioregional Assessment Programme (2016) HUN AWRA-R calibration nodes v01. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/f2da394a-3d08-4cf4-8c24-bf7751ea06a1.

    Dataset Ancestors

  9. Data from: Order Hymenoptera, family Formicidae

    • gbif.org
    • bionomia.net
    • +3 more
    Updated Nov 26, 2024
    Cite
    Cedric A. Collingwood; Donat Agosti; Mostafa R. Sharaf; Antonius van Harten (2024). Order Hymenoptera, family Formicidae [Dataset]. http://doi.org/10.5281/zenodo.1168586
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    Plazi
    Global Biodiversity Information Facility: https://www.gbif.org/
    Authors
    Cedric A. Collingwood; Donat Agosti; Mostafa R. Sharaf; Antonius van Harten
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains the digitized treatments in Plazi based on the original journal article Cedric A. Collingwood, Donat Agosti, Mostafa R. Sharaf, Antonius van Harten (2011): Order Hymenoptera, family Formicidae. Arthropod fauna of the UAE 4: 1-70, DOI: 10.5281/zenodo.1168586

  10. Replication Data for: A Corpus Based Analysis of V2 Variation in West...

    • dataverse.no
    • dataverse.azure.uit.no
    • +1 more
    bin, csv +2
    Updated Sep 28, 2023
    + more versions
    Cite
    Chloé Lybaert; Bernard De Clerck; Jorien Saelens; Ludovic De Cuypere (2023). Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects [Dataset]. http://doi.org/10.18710/NSFN2B
    Explore at:
    Available download formats: csv(823), txt(15549), text/comma-separated-values(93006), csv(85373), bin(14055)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Chloé Lybaert; Bernard De Clerck; Jorien Saelens; Ludovic De Cuypere
    License

    https://dataverse.no/api/datasets/:persistentId/versions/1.2/customlicense?persistentId=doi:10.18710/NSFN2B

    Time period covered
    1960 - 1970
    Area covered
    French, Belgium, Oost-Vlaanderen, France, Département du Nord - Département 59, West-Vlaanderen, Belgium
    Description

    Dataset abstract: The dataset includes an annotated dataset of N = 1413 sentences (or parts thereof) taken from authentic spoken corpus data from West Flemish and French Flemish (dialects of Dutch). The sentences are annotated for V2 variation (Subject-Verb inversion, the outcome variable of the associated study) and seven predictor variables, including city, region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield. The dataset also includes geographical data to create a dialect map showing the relative frequencies of V2 variation. An R Notebook with the data analysis is provided.

    Article abstract: This paper explores V2 variation in West Flemish and French Flemish dialects of Dutch based on an extensive corpus of authentic spoken data. After taking stock of the existing literature, we probe into the effect of region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield on (non)inverted word order. This is the first study that carries out regression analysis on the combined impact of these variables in the entire West Flemish and French Flemish region, with additional visualization of effect sizes. The results show that noninversion is generally more widespread than originally anticipated, with unexpectedly higher occurrence of noninversion in continental West Flemish and lower frequencies in western West Flemish. With the exception of the variable number of constituents in the prefield, all other variables had a significant impact on word order: Clausal topicalized elements, elements that have peripheral functions, and elements that lack prosodic integration all favor noninverted word order. The form of the subject also impacted word order, but its effect is sometimes overruled by discourse considerations.
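    The model described above is a binary outcome (inverted vs. noninverted word order) regressed on the listed predictors. The analysis itself ships as an R Notebook; purely as an illustration, the same kind of model in Python with hypothetical variable names:

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    # Column names are hypothetical; the dataset's real variable names are
    # documented in its README. 'inverted' is assumed to be coded 0/1.
    df = pd.read_csv("v2_variation.csv")  # placeholder file name

    model = smf.logit(
        "inverted ~ region + prosodic_integration + topic_form"
        " + topic_function + subject_form + n_prefield",
        data=df,
    ).fit()
    print(model.summary())
    ```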

  11. GAL Surface Water Reaches for Risk and Impact Analysis 20180803

    • researchdata.edu.au
    • data.gov.au
    • +1more
    Updated Dec 7, 2018
    Cite
    Bioregional Assessment Program (2018). GAL Surface Water Reaches for Risk and Impact Analysis 20180803 [Dataset]. https://researchdata.edu.au/gal-surface-water-analysis-20180803/2989417
    Dataset updated
    Dec 7, 2018
    Dataset provided by
    Data.gov: https://data.gov/
    Authors
    Bioregional Assessment Program
    License

    Attribution 2.5 (CC BY 2.5): https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    Dataset History

    Stream network constructed and defined using datasets shown in the Lineage. The stream network was constructed using surface water nodes to define reaches, and the classification was assigned by using the data from the stream network from the lineage and then assigning the following classification:

    1. surface water change due to hydrology
    2. no change modelled at link node within PAE
    3. modelled no change at link node
    4. modelled change at link node
    5. assumed change due to proximity to mine pit
    6. assumed change due to hydrology

    Further tie-breaks were decided based on stream order or stream segment length.

    Dataset Citation

    Bioregional Assessment Programme (2017) GAL Surface Water Reaches for Risk and Impact Analysis 20180803. Bioregional Assessment Derived Dataset. Viewed 12 December 2018, http://data.bioregionalassessments.gov.au/dataset/64c4d16f-bdfa-4fd6-bd72-c459503003bd.

    Dataset Ancestors

    * Derived From Onsite and offsite mine infrastructure for the Carmichael Coal Mine and Rail Project, Adani Mining Pty Ltd 2012
    * Derived From Alpha Coal Project Environmental Impact Statement
    * Derived From Geofabric Surface Cartography - V2.1
    * Derived From QLD Exploration and Production Tenements (20140728)
    * Derived From China Stone Coal Project initial advice statement
    * Derived From Kevin's Corner Project Environmental Impact Statement
    * Derived From Galilee surface water modelling nodes
    * Derived From Geoscience Australia GEODATA TOPO series - 1:1 Million to 1:10 Million scale
    * Derived From China First Galilee Coal Project Environmental Impact Assessment
    * Derived From GEODATA TOPO 250K Series 3
    * Derived From Seven coal mines included in Galilee surface water modelling

  12. HUN AWRA-R Gauge Station Cross Sections v01

    • data.gov.au
    • researchdata.edu.au
    • +2 more
    zip
    Updated Nov 20, 2019
    + more versions
    Cite
    Bioregional Assessment Program (2019). HUN AWRA-R Gauge Station Cross Sections v01 [Dataset]. https://data.gov.au/data/dataset/activity/93fbc2b9-463c-42f6-8817-cb45a54ee28e
    Explore at:
    Available download formats: zip(12360)
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This dataset was supplied to the Bioregional Assessment Programme by a third party and is presented here as originally supplied. The metadata was not provided by the data supplier and has been compiled by the programme based on known details.

    River cross sections for selected gauging stations in the Hunter subregion, extracted from PINNEENA CM version 10.2 on DVD, released in May 2014. The data is in a comma separated file (CSV). The relevant information contained in the CSV is as follows: site, cross section ID, order (measurement), chain and level.

    Purpose

    The cross-sections are used in river modelling to determine river reach volumes and gains (e.g. rainfall on river) and losses (e.g. leakage to groundwater, evapotranspiration from river).
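    One common way such cross sections feed into reach-volume estimates is to integrate flow depth across the section. A sketch under the assumption that chain is horizontal distance and level is bed elevation; the file and column names are guesses.

    ```python
    import numpy as np
    import pandas as pd

    xs = pd.read_csv("cross_sections.csv")   # placeholder file name
    one = (xs[xs["cross_section_id"] == 1]   # guessed column names
             .sort_values("order"))

    stage = 10.0  # example water level, in the survey's elevation datum
    depth = np.clip(stage - one["level"].to_numpy(), 0.0, None)
    area = np.trapz(depth, x=one["chain"].to_numpy())  # wetted area of the section
    print(f"wetted area at stage {stage}: {area:.1f}")
    ```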

    Dataset History

    This dataset was supplied to the Bioregional Assessment Programme by the New South Wales Office of Water through the PINNEENA CM version 10.2 on DVD released in May 2014, and is presented here as originally supplied.

    Dataset Citation

    Bioregional Assessment Programme (2016) HUN AWRA-R Gauge Station Cross Sections v01. Bioregional Assessment Source Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/93fbc2b9-463c-42f6-8817-cb45a54ee28e.

  13. iEEG-Multicenter-Dataset

    • openneuro.org
    Updated Dec 2, 2020
    + more versions
    Cite
    Adam Li; Sara Inati; Kareem Zaghloul; Nathan Crone; William Anderson; Emily Johnson; Iahn Cajigas; Damian Brusko; Jonathan Jagid; Angel Claudio; Andres Kanner; Jennifer Hopp; Stephanie Chen; Jennifer Haagensen; Sridevi Sarma (2020). iEEG-Multicenter-Dataset [Dataset]. http://doi.org/10.18112/openneuro.ds003029.v1.0.1
    Dataset updated
    Dec 2, 2020
    Dataset provided by
    OpenNeuro: https://openneuro.org/
    Authors
    Adam Li; Sara Inati; Kareem Zaghloul; Nathan Crone; William Anderson; Emily Johnson; Iahn Cajigas; Damian Brusko; Jonathan Jagid; Angel Claudio; Andres Kanner; Jennifer Hopp; Stephanie Chen; Jennifer Haagensen; Sridevi Sarma
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Fragility Multi-Center Retrospective Study

    iEEG and EEG data from 5 centers, with a total of 100 subjects, are organized in our study. We publish 4 centers' datasets here due to data sharing issues.

    Acquisitions include ECoG and SEEG. Each run specifies a different snapshot of EEG data from that specific subject's session. For seizure sessions, this means that each run is an EEG snapshot around a different seizure event.

    For additional clinical metadata about each subject, refer to the clinical Excel table in the publication.

    Data Availability

    NIH, JHH, UMMC, and UMF agreed to share. Cleveland Clinic did not, so access to its data requires an additional DUA.

    All data, except for Cleveland Clinic's, were approved by their centers to be de-identified and shared. All data in this dataset have no PHI or other identifiers associated with patients. In order to access Cleveland Clinic data, please forward all requests to Amber Sours, SOURSA@ccf.org:

    Amber Sours, MPH Research Supervisor | Epilepsy Center Cleveland Clinic | 9500 Euclid Ave. S3-399 | Cleveland, OH 44195 (216) 444-8638

    You will need to sign a data use agreement (DUA).

    Sourcedata

    For each subject, there was a raw EDF file, which was converted into the BrainVision format with mne_bids. Each subject with SEEG implantation also has an Excel table, called electrode_layout.xlsx, which outlines where the clinicians marked each electrode anatomically. Note that there is no rigorous atlas applied, so the main points of interest are: WM, GM, VENTRICLE, CSF, and OUT, which represent white matter, gray matter, ventricle, cerebrospinal fluid and outside the brain. WM, VENTRICLE, CSF and OUT channels were removed from further analysis. These were labeled in the corresponding BIDS channels.tsv sidecar file as status=bad. The dataset uploaded to openneuro.org does not contain the sourcedata since there was an extra anonymization step that occurred when fully converting to BIDS.
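    The EDF-to-BIDS step described above follows the usual mne_bids pattern. A minimal sketch with placeholder file, subject, task and channel names; this is not the study's actual conversion script.

    ```python
    import mne
    from mne_bids import BIDSPath, write_raw_bids

    raw = mne.io.read_raw_edf("sub-01_seizure.edf")  # placeholder file name
    raw.info["line_freq"] = 60                       # required by write_raw_bids

    # Channels marked WM/VENTRICLE/CSF/OUT in electrode_layout.xlsx end up as
    # status=bad in the BIDS channels.tsv; MNE tracks them in info["bads"].
    raw.info["bads"] = ["POL A1", "POL A2"]          # placeholder channel names

    bids_path = BIDSPath(subject="01", session="presurgery", task="ictal",
                         run="01", datatype="ieeg", root="./bids_dataset")
    write_raw_bids(raw, bids_path, overwrite=True)
    ```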

    Derivatives

    Derivatives include:

    * fragility analysis
    * frequency analysis
    * graph metrics analysis
    * figures

    These can be computed by following this paper: Neural Fragility as an EEG Marker for the Seizure Onset Zone

    Events and Descriptions

    Within each EDF file, there are event markers that are annotated by clinicians, which may inform you of specific clinical events that are occurring in time, or of when they saw seizure onset and offset (clinical and electrographic).

    During a seizure event, event markers may follow this time course:

    * eeg onset, or clinical onset - the onset of a seizure that is either marked electrographically, or by clinical behavior. Note that the clinical onset may not always be present, since some seizures manifest without clinical behavioral changes.
    * Marker/Mark On - these are annotations, present in some cases, where a health practitioner injects a chemical marker for use in ICTAL SPECT imaging after a seizure occurs. This is commonly done to see which portions of the brain are active metabolically.
    * Marker/Mark Off - This is when the ICTAL SPECT stops imaging.
    * eeg offset, or clinical offset - this is the offset of the seizure, as determined either electrographically, or by clinical symptoms.
    

    Other events included may be beneficial for you to understand the time course of each seizure. Note that ICTAL SPECT occurs in all Cleveland Clinic data. Note that seizure markers are not consistent in their description naming, so one might encode some specific regular-expression rules to consistently capture seizure onset/offset markers across all datasets. In the case of UMMC data, all onset and offset markers were provided by the clinicians in an Excel sheet instead of via the EDF file, so we added the annotations manually to each EDF file.
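    As an illustration of such regular-expression rules (the patterns below are invented for the example; real marker strings vary by center):

    ```python
    import re

    ONSET = re.compile(r"(eeg|clinical|sz|seizure)\s*(onset|start)", re.I)
    OFFSET = re.compile(r"(eeg|clinical|sz|seizure)\s*(offset|end|stop)", re.I)

    for desc in ["EEG onset", "sz offset", "Marker/Mark On"]:
        if ONSET.search(desc):
            print(desc, "-> onset")
        elif OFFSET.search(desc):
            print(desc, "-> offset")
        else:
            print(desc, "-> other annotation")
    ```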

    Seizure Electrographic and Clinical Onset Annotations

    For various datasets, there are seizures present within the dataset. Generally there is only one seizure per EDF file. When seizures are present, they are marked electrographically (and clinically if present) via standard approaches in the epilepsy clinical workflow.

    Clinical onset is simply the manifestation of the seizure as clinical symptoms. Sometimes this marker may not be present.

    Seizure Onset Zone Annotations

    What is actually important in the evaluation of these datasets are the clinicians' annotations of their localization hypotheses of the seizure onset zone.

    These generally include:

    * early onset: the earliest onset electrodes participating in the seizure that clinicians saw
    * early/late spread (optional): the electrodes that showed epileptic spread activity after seizure onset. Not all seizures have spread contacts annotated.
    

    Surgical Zone (Resection or Ablation) Annotations

    For patients with the post-surgical MRI available, the segmentation process outlined above tells us which electrodes were within the surgically removed brain region.

    Otherwise, clinicians give us their best estimate of which electrodes were resected/ablated based on their surgical notes.

    For surgical patients whose postoperative medical records did not explicitly indicate specific resected or ablated contacts, manual visual inspection was performed to determine the approximate contacts that were located in later resected/ablated tissue. Postoperative T1 MRI scans were compared against post-SEEG implantation CT scans or CURRY coregistrations of preoperative MRI/post-SEEG CT scans. Contacts of interest in and around the area of the reported resection were selected individually, and the corresponding slice was navigated to on the CT scan or CURRY coregistration. After identifying landmarks of that slice (e.g. skull shape, skull features, shape of prominent brain structures like the ventricles, central sulcus, superior temporal gyrus, etc.), the location of a given contact in relation to these landmarks, and the location of the slice along the axial plane, the corresponding slice in the postoperative MRI scan was navigated to. The resected tissue within the slice was then visually inspected and compared against the distinct landmarks identified in the CT scans; if brain tissue was not present in the corresponding location of the contact, the contact was marked as resected/ablated. This process was repeated for each contact of interest.

    References

    Adam Li, Chester Huynh, Zachary Fitzgerald, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Emily Johnson, William Anderson, Nathan Crone, Sara Inati, Kareem Zaghloul, Juan Bulacio, Jorge Gonzalez-Martinez, Sridevi V. Sarma. Neural Fragility as an EEG Marker of the Seizure Onset Zone. bioRxiv 862797; doi: https://doi.org/10.1101/862797

    Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Höchenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. and Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software 4: (1896). https://doi.org/10.21105/joss.01896

    Holdgraf, C., Appelhoff, S., Bickel, S., Bouchard, K., D'Ambrosio, S., David, O., … Hermes, D. (2019). iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology. Scientific Data, 6, 102. https://doi.org/10.1038/s41597-019-0105-7

    Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8

  14. Replication data for: Old Church Slavonic byti Part One and Part Two

    • dataverse.no
    • search.dataone.org
    pdf +2
    Updated Sep 28, 2023
    Cite
    Hanne M. Eckhoff; Tore Nesset; Laura A. Janda (2023). Replication data for: Old Church Slavonic byti Part One and Part Two [Dataset]. http://doi.org/10.18710/P9REAV
    Explore at:
    Available download formats: text/plain; charset=us-ascii(1162043), text/plain; charset=utf-8(5405111), pdf(129969), text/plain; charset=utf-8(5405110), text/plain; charset=utf-8(3035), text/plain; charset=utf-8(2673), text/plain; charset=utf-8(14170), text/plain; charset=utf-8(5674)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Hanne M. Eckhoff; Tore Nesset; Laura A. Janda
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Eastern Europe, Norway
    Dataset funded by
    The Research Council of Norway
    Description

    Abstract Part One: There is controversy over whether byti ‘be’ in Old Church Slavonic functioned as an imperfective verb with an unusually large number of inflected forms or as an aspectual pair of verbs, reflecting its suppletive origin from two stems (es- and bū-). We offer an objective empirical approach to the status of this verb, using statistical analysis of 2,428 attestations of byti in comparison with 9,694 attestations of 129 other verbs. This makes it possible to accurately locate byti in the context of the verbal lexicon of Old Church Slavonic. The comparison is made via grammatical profiles, a method that examines the frequency distribution of each verb’s inflected forms. This comparison is undertaken in two rounds, one assuming that byti is a single verb, and the other assuming that it is a pair of verbs. Both assumptions yield reasonable results, and although the grammatical profile analyses do not suffice to solve the controversy, they lay the groundwork for further analysis in Part Two that argues for a single-verb interpretation of byti.

    Data and R Scripts Part One: Our analysis uses two datasets, one that presents the forms of byti as a single paradigm, verbs.csv, and one that presents it as a pair of verbs, splitverbs.csv. In order to represent the Church Slavonic orthography, you will need our transliteration script, translit.r. This script is sourced by the scripts for our analysis, which present byti as either a single verb or a verb pair: PartOneSingleVerb.r and PartOneVerbPair.r. These scripts perform all of the steps for the analysis in our article and generate the plots.

    Abstract Part Two: The verb byti ‘be’ in Old Church Slavonic appears in an unusually rich inventory of grammatical constructions. We analyze corpus data on the distribution of constructions in order to assess the status of this verb as either a single verb or an aspectual pair of verbs. Our study moves beyond a strict structuralist interpretation of the behavior of byti, instead recognizing the real variation and ambiguity in the data. Our findings make both theoretical and descriptive advances. The radial category structure is a central tenet of cognitive linguistics, but until now such structures have usually been posited by researchers based on their qualitative insights from data. We show that it is possible to identify both the nodes and the structure of a radial category statistically, using only linguistic data as input. We provide an enhanced description of byti that clearly distinguishes between core uses and those that are more peripheral, and shows the relationships among them. While we find some evidence in support of an aspectual pair, most evidence points instead toward a single verb.

    Data and R Script Part Two: The dataset used in this analysis is frames.csv. The R script used in this analysis is PartTwo.r.
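    A grammatical profile as used above is simply each verb's relative frequency distribution over its inflected forms. A compact sketch, in Python rather than the R scripts shipped with the data, and with hypothetical column names:

    ```python
    import pandas as pd

    df = pd.read_csv("verbs.csv")  # one row per attestation (assumed layout)

    # Rows: verb lemma; columns: grammatical form; cells: relative frequency.
    profiles = pd.crosstab(df["lemma"], df["form"], normalize="index")
    print(profiles.loc["byti"])  # the profile for byti, if so labeled in the file
    ```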

  15. Replication Data for: Understanding ‘many’ through the lens of Ukrainian...

    • search-demo.dataone.org
    • dataverse.no
    • +1 more
    Updated Sep 25, 2024
    + more versions
    Cite
    Janda, Laura Alexis (2024). Replication Data for: Understanding ‘many’ through the lens of Ukrainian багато [Dataset]. http://doi.org/10.18710/Y7VGQE
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    DataverseNO
    Authors
    Janda, Laura Alexis
    Time period covered
    Jan 1, 1742 - Jan 1, 2023
    Description

    Dataset description: The General Regionally Annotated Corpus of Ukrainian (GRAC, Shvedova et al. 2017-2024, uacorpus.org) was consulted to collect data for further analysis concerning the distribution of Singular vs. Plural verb forms in the target bahato construction. GRAC is a Sketch Engine corpus of over 1.8 billion words, representing texts from over 30,000 authors created between 1816 and 2023. This corpus is designed to serve as source material for linguistic research on Standard Ukrainian. Our data was collected during the month of February 2024. We extracted and annotated 28,491 examples of the bahato construction. An additional set of examples was collected from the Russian National Corpus (ruscorpora.ru) during the month of August 2024 to provide comparison with the Russian mnogo construction. For this purpose, 6,612 examples were extracted and annotated for word order and Singular vs. Plural verb agreement. Both the Ukrainian and the Russian data are included in this dataset, along with the R scripts used to analyze this data.

    Article abstract: We reveal an ongoing language change in Ukrainian involving a construction with a subject comprised of the indefinite quantifier багато ‘many’ modifying a noun phrase in the Genitive Plural. Number agreement on the verb varies, allowing both Singular (in 69.1% of attestations) and Plural (in 30.9% of attestations). Based on statistical analysis of corpus data, we investigate the influence of the factors of year of creation, word order of subject and verb, and animacy of the subject on the choice of verb number. We find that, while all combinations of word order and animacy are robustly attested, VS word order and inanimate subjects tend to prefer Singular, whereas SV word order and animate subjects tend to prefer Plural. Since about the 1950s, the proportion of Plural has been increasing, overtaking Singular in the current decade. We propose that this Singular vs. Plural variation is motivated by the human embodied experience of construing a group of items as either a homogeneous mass (and therefore Singular) or a multiplicity of individuals (and therefore Plural). This proposal is supported by the identification of micro-constructions that prefer Singular and show reduced individuation of human beings.
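    The diachronic trend reported above (Plural overtaking Singular) reduces to a proportion-by-decade table. A sketch with hypothetical file and column names, not the shipped R scripts:

    ```python
    import pandas as pd

    df = pd.read_csv("bahato_grac.csv")  # placeholder file name
    # Assumed columns: 'year' of creation and 'number' coded 'sg'/'pl'.
    df["decade"] = (df["year"] // 10) * 10
    share = pd.crosstab(df["decade"], df["number"], normalize="index")
    print(share.tail())  # Plural share rising toward the current decade
    ```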

  16. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • openicpsr.org
    Updated May 18, 2018
    + more versions
    Cite
    Jacob Kaplan (2018). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Hate Crime Data 1991-2019 [Dataset]. http://doi.org/10.3886/E103500V7
    Dataset updated
    May 18, 2018
    Dataset provided by
    University of Pennsylvania
    Authors
    Jacob Kaplan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1991 - 2019
    Area covered
    United States
    Description

    !!!WARNING~~~ This dataset has a large number of flaws and is unable to properly answer many questions that people generally use it to answer, such as whether national hate crimes are changing (or at least they use the data so improperly that they get the wrong answer). A large number of people using this data (academics, advocates, reporting, US Congress) do so inappropriately and get the wrong answer to their questions as a result. Indeed, many published papers using this data should be retracted. Before using this data I highly recommend that you thoroughly read my book on UCR data, particularly the chapter on hate crimes (https://ucrbook.com/hate-crimes.html) as well as the FBI's own manual on this data. The questions you could potentially answer well are relatively narrow and generally exclude any causal relationships. ~~~WARNING!!!

    Version 8 release notes: Adds 2019 data.

    Version 7 release notes: Changes release notes description, does not change data.

    Version 6 release notes: Adds 2018 data.

    Version 5 release notes: Adds data in the following formats: SPSS, SAS, and Excel. Changes project name to avoid confusing this data for the ones done by NACJD. Adds data for 1991. Fixes bug where bias motivation "anti-lesbian, gay, bisexual, or transgender, mixed group (lgbt)" was labeled "anti-homosexual (gay and lesbian)" prior to 2013, causing there to be two columns and zero values for years with the wrong label. All data is now directly from the FBI, not NACJD. The data initially comes as ASCII+SPSS Setup files and is read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R.

    Version 4 release notes: Adds data for 2017. Adds rows that submitted a zero-report (i.e. that agency reported no hate crimes in the year); this is for all years 1992-2017. Made changes to categorical variables (e.g. bias motivation columns) to make categories consistent over time; different years had slightly different names (e.g. 'anti-am indian' and 'anti-american indian') which I made consistent. Made the 'population' column, which is the total population in that agency.

    Version 3 release notes: Adds data for 2016. Orders rows by year (descending) and ORI.

    Version 2 release notes: Fix bug where Philadelphia Police Department had incorrect FIPS county code.

    The Hate Crime data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about hate crimes reported in the United States. Please note that the files are quite large and may take some time to open. Each row indicates a hate crime incident for an agency in a given year. I have made a unique ID column ("unique_id") by combining the year, agency ORI9 (the 9 character Originating Identifier code), and incident number columns together. Each column is a variable related to that incident or to the reporting agency. Some of the important columns are the incident date, what crime occurred (up to 10 crimes), the number of victims for each of these crimes, the bias motivation for each of these crimes, and the location of each crime. It also includes the total number of victims, total number of offenders, and race of offenders (as a group). Finally, it has a number of columns indicating if the victim for each offense was a certain type of victim or not (e.g. individual victim, business victim, religious victim, etc.).

    The only changes I made to the data are the following: minor changes to column names to make all column names 32 characters or fewer (so it can be saved in a Stata format), making all character values lower case, and reordering columns. I also generated incident month, weekday, and month-day variables from the incident date variable included in the original data.
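    The unique_id scheme described above is straightforward to reproduce or verify. A sketch with guessed column names for the cleaned file:

    ```python
    import pandas as pd

    df = pd.read_csv("hate_crimes.csv")  # placeholder file name
    # Guessed column names: 'year', 'ori9', 'incident_number'.
    df["unique_id"] = (df["year"].astype(str) + "_"
                       + df["ori9"].astype(str) + "_"
                       + df["incident_number"].astype(str))
    assert df["unique_id"].is_unique  # each incident should appear once
    ```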

  17. threads-stack-overflow

    • zenodo.org
    json
    Updated Dec 16, 2023
    Cite
    Nicholas Landry (2023). threads-stack-overflow [Dataset]. http://doi.org/10.5281/zenodo.10373328
    Explore at:
    Available download formats: json
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Nicholas Landry
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    This is a temporal higher-order network dataset, which here means a sequence of timestamped hyperedges where each hyperedge is a set of nodes. In this dataset, nodes are users on stackoverflow.com, and a hyperedge comes from users participating in a thread that lasts for at most 24 hours. The timestamps are the time of the post, but normalized so that the earliest post starts at 0.
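
    As a rough sketch of how one might load these timestamped hyperedges in R, the snippet below assumes the JSON is an array of hyperedge objects with hypothetical "nodes" and "timestamp" fields; the real schema may differ.

        library(jsonlite)

        # Hypothetical schema: each hyperedge has a "nodes" list (user IDs)
        # and a normalized "timestamp" (earliest post = 0).
        threads <- fromJSON("threads-stack-overflow.json", simplifyVector = FALSE)

        first <- threads[[1]]
        cat("nodes:", unlist(first$nodes), "\n")
        cat("timestamp:", first$timestamp, "\n")

        # Hyperedge sizes, i.e. how many users participated in each thread.
        sizes <- vapply(threads, function(e) length(e$nodes), integer(1))
        print(table(sizes))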

    Source of original data

    Source: threads-stack-overflow dataset

    References

    If you use this data, please cite the following paper:

  18. d

    Replication Data for: Together and apart: Perfective verbs with a prefix and...

    • search.dataone.org
    • dataverse.harvard.edu
    • +2more
    Updated Jul 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Nordrum, Maria (2024). Replication Data for: Together and apart: Perfective verbs with a prefix and the semelfactive suffix -nu- in Contemporary Standard Russian [Dataset]. https://search.dataone.org/view/sha256%3Aa893825028302f869d69c11adfd09f9b1fe2d1d0246f1f5083baa2192aa71610
    Explore at:
    Dataset updated
    Jul 29, 2024
    Dataset provided by
    DataverseNO
    Authors
    Nordrum, Maria
    Time period covered
    Jan 1, 1950 - Jan 1, 2017
    Description

    This dataset includes all the data files that were used for the studies in my PhD dissertation "Together and apart: Perfective verbs with a prefix and the semelfactive suffix -nu- in Contemporary Standard Russian" (2019). Most of the files involve data tables with annotated corpus data from the Russian National Corpus (www.ruscorpora.ru). Every tabular data file is available in two formats: .csv and .xlsx. The data files are numbered so that they appear in the same order as they become relevant in the dissertation, and their contents are described in the ReadMe files with the same numbers. In addition, the dataset contains the R script and .txt file that were used to make the plot in Chapter 2. The dissertation investigates Russian perfective verbs with a prefix and the semelfactive suffix -nu-, such as zaxlopnut’ ‘slam shut’, referred to for the sake of simplicity as “Pref-Nu verbs”. Pref-Nu verbs have received only marginal attention in the scholarly literature, so a main goal of the dissertation is to shed light on their distribution and productivity in Contemporary Standard Russian, as well as the types of verb clusters and semantic classes they represent. A second main goal is to compare the behavior of Pref-Nu verbs with the behavior of perfective verbs that have only one of the two relevant affixes, i.e. only a prefix, such as zaxlopat’ ‘begin to slam’, or only the semelfactive suffix -nu-, such as xlopnut’ ‘slam, clap, bang once’. All of these questions are explored with data from the Russian National Corpus (years 1950-2017), and, as a general tendency, Pref-Nu verbs are found to differ from the other two verb types in that they express a single “quantum” of an action that yields some result. The choice between Pref-Nu verbs and other perfectives is furthermore explored through an informant experiment that focuses on cases of near-synonymy between related verbs. The results of the experiment indicate that none of the verbs are fully synonymous, although their semantic differences are often subtle.

  19. m

    Data for: A systematic review showed no performance benefit of machine...

    • data.mendeley.com
    • search.datacite.org
    Updated Mar 14, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ben Van Calster (2019). Data for: A systematic review showed no performance benefit of machine learning over logistic regression for clinical prediction models [Dataset]. http://doi.org/10.17632/sypyt6c2mc.1
    Explore at:
    Dataset updated
    Mar 14, 2019
    Authors
    Ben Van Calster
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The uploaded files are:

    1) An Excel file containing 6 sheets, in order: "Data Extraction" (the summarized final data extractions from the three reviewers involved), "Comparison Data" (data related to the comparisons investigated), "Paper level data" (summaries at the paper level), "Outcome Event Data" (information on the number of events for every outcome investigated within a paper), and "Tuning Classification" (data on how the hyperparameters of the machine learning algorithms were tuned).

    2) The R script used for the analysis. To read the data, save the "Comparison Data", "Paper level data", and "Outcome Event Data" Excel sheets as txt files. In the R script, srpap refers to the "Paper level data" sheet, srevents to the "Outcome Event Data" sheet, and srcompx to the "Comparison Data" sheet (a hypothetical loading sketch follows this list).

    3) Supplementary material, including the search string, tables of data, and figures.

    4) PRISMA checklist items
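
    A hypothetical version of the loading step described in item 2, in R. The file names for the three saved sheets are assumptions; adjust them to match however you named the exported txt files.

        # Assumed file names for the three sheets saved as tab-delimited text.
        srpap    <- read.delim("paper_level_data.txt",   stringsAsFactors = FALSE)
        srevents <- read.delim("outcome_event_data.txt", stringsAsFactors = FALSE)
        srcompx  <- read.delim("comparison_data.txt",    stringsAsFactors = FALSE)

        str(srpap)  # sanity-check the structure before running the analysis script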

  20. g

    NOAA GOES-R Series Geostationary Lightning Mapper (GLM) Level 0 Data

    • gimi9.com
    • catalog.data.gov
    Updated Sep 19, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2023). NOAA GOES-R Series Geostationary Lightning Mapper (GLM) Level 0 Data [Dataset]. https://gimi9.com/dataset/data-gov_noaa-goes-r-series-geostationary-lightning-mapper-glm-level-0-data1
    Explore at:
    Dataset updated
    Sep 19, 2023
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data collection consists of archived Geostationary Operational Environmental Satellite-R (GOES-R) Series Geostationary Lightning Mapper (GLM) Level 0 data from the GOES-East and GOES-West satellites in the operational (OPS) and post-launch test (PLT) phases. The GOES-R Series provides continuity of the GOES mission through 2035 and improvements in geostationary satellite observational data. GOES-16, the first GOES-R satellite, began operating as GOES-East on December 18, 2017. GOES-17 began operating as GOES-West on February 12, 2019. GOES-T launched on March 1, 2022, and was renamed GOES-18 on March 14, 2022. GOES-U, the final satellite in the series, is scheduled to launch in 2024. GLM is a near-infrared optical transient detector observing the Western Hemisphere. The GLM Level 0 data are composed of Consultative Committee for Space Data Systems (CCSDS) packets containing the science, housekeeping, engineering, and diagnostic telemetry data downlinked from the instrument. The Level 0 data files also contain orbit and attitude/angular-rate packets generated by the GOES spacecraft. Each CCSDS packet contains a unique Application Process Identifier (APID) in its primary header that identifies the specific type of packet and is used to support interpretation of its contents. Users may refer to the GOES-R Series Product Definition and Users’ Guide (PUG) Volume 1 (Main) and Volume 2 (Level 0 Products) for Level 0 data documentation. Related instrument calibration data and Level 1b processing information are archived and available for order at the NOAA CLASS website. The GLM Level 0 data files are delivered in a netCDF-4 file format; however, the constituent CCSDS packets are stored in a byte array, making the data opaque to standard netCDF reader applications. The GLM Level 0 data files are packaged in hourly tar files (data bundles) by satellite for the archive. Recently ingested archive tar files are available for 14 days on an anonymous FTP server for users to download. Data archived on offline tape may be requested from NCEI.
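
    To make the packet structure concrete, here is an illustrative R sketch (not an official reader) that extracts the APID from a 6-byte CCSDS primary header. The file name is a placeholder, and the snippet assumes the raw packet bytes have already been pulled out of the netCDF byte array.

        # Illustrative only; the file name is a placeholder.
        con <- file("glm_level0_packet.bin", "rb")
        hdr <- readBin(con, what = "raw", n = 6)  # 6-byte CCSDS primary header
        close(con)

        # Per the CCSDS standard, the APID is the low 11 bits of the first two
        # header bytes (3-bit version, 1-bit type, 1-bit secondary-header flag,
        # then the 11-bit APID).
        apid <- bitwOr(bitwShiftL(bitwAnd(as.integer(hdr[1]), 0x07L), 8),
                       as.integer(hdr[2]))

        # The packet length field (bytes 5-6) holds one less than the number
        # of bytes in the packet data field.
        data_len <- bitwOr(bitwShiftL(as.integer(hdr[5]), 8), as.integer(hdr[6])) + 1

        cat(sprintf("APID: 0x%03X, packet data field: %d bytes\n", apid, data_len))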
