77 datasets found
  1. Sign Line Task & Work Order Data

    • s.cnmilf.com
    • data.lacity.org
    • +1 more
    Updated Nov 29, 2021
    Cite
    data.lacity.org (2021). Sign Line Task & Work Order Data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/sign-line-task-work-order-data
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    data.lacity.org
    Description

    Sign (Task & Work Order) Information from eWork.

  2. sort

    • data.cityofchicago.org
    application/rdfxml +5
    Updated Mar 27, 2025
    + more versions
    Cite
    Chicago Police Department (2025). sort [Dataset]. https://data.cityofchicago.org/Public-Safety/sort/bnsx-zzcw
    Explore at:
    Available download formats: xml, tsv, csv, json, application/rdfxml, application/rssxml
    Dataset updated
    Mar 27, 2025
    Authors
    Chicago Police Department
    Description

    This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org.

    Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

    Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
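    Because the full extract cannot be opened in Excel, pulling it programmatically is often easier. Below is a minimal Python sketch, assuming the standard Socrata CSV endpoint for the dataset ID in the citation above (bnsx-zzcw); the field name primary_type is the usual Socrata name on this portal but should be verified before use.

    ```python
    import pandas as pd

    # Socrata serves datasets as paged CSV endpoints; $limit/$offset control
    # paging, so the full extract can be streamed in chunks instead of being
    # opened as one huge file. The dataset ID comes from the citation URL above.
    BASE = "https://data.cityofchicago.org/resource/bnsx-zzcw.csv"

    chunks, offset = [], 0
    while True:
        page = pd.read_csv(f"{BASE}?$limit=50000&$offset={offset}")
        if page.empty:
            break
        chunks.append(page)
        offset += len(page)  # for strictly stable paging, also add a $order clause

    crimes = pd.concat(chunks, ignore_index=True)
    print(crimes["primary_type"].value_counts().head())  # verify field name
    ```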

  3. Data from: A generalized R-matrix propagation program for solving coupled...

    • data.mendeley.com
    Updated Jan 1, 1984
    + more versions
    Cite
    Lesley A. Morgan (1984). A generalized R-matrix propagation program for solving coupled second-order differential equations [Dataset]. http://doi.org/10.17632/txszxrvgbk.1
    Dataset updated
    Jan 1, 1984
    Authors
    Lesley A. Morgan
    License

    https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/

    Description

    Title of program: RPROP2. Catalogue Id: AAJK_v2_0 [AAJL]

    Nature of problem Coupled second-order differential equations which arise in electron collision with atoms, ions and molecules are solved over a given range of the independent variable. The R-matrix at one end of the range is calculated given the R-matrix at the other end of the range.
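    For orientation, the sector-propagation step that programs of this family implement can be written in its generic textbook form; this is not an excerpt from RPROP2, and sign conventions vary between formulations.

    ```latex
    % Given the global R-matrix R(a) at one edge of a sector [a, b] and the
    % sector matrices r_{ij} built from local solutions, the R-matrix at the
    % other edge follows as:
    R(b) \,=\, r_{bb} - r_{ba}\,\bigl(r_{aa} + R(a)\bigr)^{-1} r_{ab}
    ```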

    Versions of this program held in the CPC repository in Mendeley Data:
    AAJK_v1_0; RPROP; 10.1016/0010-4655(82)90177-1
    AAJK_v2_0; RPROP2; 10.1016/0010-4655(84)90025-0

    This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2018)

  4. Replication Data for: Subject Placement in the History of Latin

    • dataverse.azure.uit.no
    • dataverse.no
    txt
    Updated Sep 28, 2023
    Cite
    Lieven Danckaert (2023). Replication Data for: Subject Placement in the History of Latin [Dataset]. http://doi.org/10.18710/V9D674
    Explore at:
    Available download formats: txt(531057), txt(3449), txt(6149)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Lieven Danckaert
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The present dataset was used in a corpus study on the diachrony of subject placement in the history of Latin, to appear in 'Catalan Journal of Linguistics'. The main file contains a set of Latin examples, which have all been annotated for a number of variables needed for the purpose of the study. A detailed description of the contents of this dataset is given in the README file. Finally, there is a file with the R code used to produce all the quantitative data mentioned in the paper. The abstract of the article follows below.

    Abstract: The aim of this paper is to provide further support for one aspect of the analysis of Classical and Late Latin clause structure proposed in Danckaert (2017a), namely the diachrony of subject placement. According to the relevant proposal, one needs to distinguish an earlier grammar (‘Grammar A’, whose heyday is the period from ca. 200 BC until 200 AD), in which there is no A-movement for subjects, and a later grammar (‘Grammar B’, which is on the rise from ca. 50-100 AD, and fully productive from ca. 200 AD onwards), where subjects optionally move to the inflectional layer. Assuming the variationist acquisition model of language change developed in Yang (2000, 2002a,b), I present corpus evidence which confirms that it is only in the Late Latin period that TP-internal subjects fully establish themselves as a grammatical option.

  5. stock_TESLA

    • kaggle.com
    Updated Dec 13, 2023
    + more versions
    Cite
    willian oliveira gibin (2023). stock_TESLA [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/stock-tesla
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 13, 2023
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    willian oliveira gibin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The “Tesla Stock Price Data (Last One Year)” dataset is a comprehensive collection of historical stock market information, focusing on Tesla Inc. (TSLA) for the past year. This dataset serves as a valuable resource for financial analysts, investors, researchers, and data enthusiasts who are interested in studying the trends, patterns, and performance of Tesla’s stock in the financial markets. It consists of 9 columns covering date, high and low prices, open and closing values, volume, cumulative open interest and, of course, change of price.

    As a first step, in order to better understand the data, we should plot the time series of each attribute. The cumulative Open Interest (OI) is the total number of open contracts being held in a particular Future, Call or Put contract on the Exchange. We can see that the biggest drop of the stock happened in January of 2023, and after 5 to 6 months it regained its stock value around the summer of the same year, with opening and closing prices around 300.

    As a next step, we plot some more charts in order to better understand the relation between our target column (change of price) and every other attribute. To interpret the results:

    Linear Regression:

    • Mean Absolute Error (MAE): 6.28. This model, on average, predicts the “Price Change” within approximately 6.28 units of the true value.
    • Mean Squared Error (MSE): 52.97. MSE measures the average of squared differences, and this value suggests some variability in prediction errors.
    • Root Mean Squared Error (RMSE): 7.28. RMSE is the square root of MSE and is in the same units as the target variable. An RMSE of 7.28 indicates the typical prediction error.
    • R-squared (R2): 0.0868. R-squared represents the proportion of the variance in the target variable explained by the model. An R2 of 0.0868 suggests that the model explains only a small portion of the variance, indicating limited predictive power.

    Decision Tree Regression:

    • Mean Absolute Error (MAE): 9.21. This model, on average, predicts the “Price Change” within approximately 9.21 units of the true value, which is higher than the Linear Regression model.
    • Mean Squared Error (MSE): 150.69. The MSE is relatively high, indicating larger prediction errors and more variability.
    • Root Mean Squared Error (RMSE): 12.28. An RMSE of 12.28 is notably higher, suggesting that this model has larger prediction errors.
    • R-squared (R2): -1.598. The negative R-squared value indicates that the model performs worse than a horizontal line as a predictor, indicating a poor fit.

    Random Forest Regression:

    • Mean Absolute Error (MAE): 6.99. This model, on average, predicts the “Price Change” within approximately 6.99 units of the true value, similar to Linear Regression.
    • Mean Squared Error (MSE): 62.79. MSE is lower than the Decision Tree model but higher than Linear Regression, suggesting intermediate prediction accuracy.
    • Root Mean Squared Error (RMSE): 7.92. RMSE is also intermediate, indicating moderate prediction errors.
    • R-squared (R2): -0.0824. The negative R-squared suggests that the Random Forest model does not perform well and has limited predictive power.
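    For reference, all four quoted metrics are simple to reproduce. A minimal sketch, assuming scikit-learn and arrays of true and predicted price changes; this makes no claim about how the original notebook computed them.

    ```python
    import numpy as np
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    def report(y_true, y_pred, name):
        """Print the four metrics quoted above for one model."""
        mae = mean_absolute_error(y_true, y_pred)
        mse = mean_squared_error(y_true, y_pred)
        rmse = np.sqrt(mse)            # same units as the target variable
        r2 = r2_score(y_true, y_pred)  # negative = worse than predicting the mean
        print(f"{name}: MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}  R2={r2:.4f}")

    report(np.array([1.0, 2.0, 3.0]), np.array([1.5, 1.8, 2.4]), "example")
    ```

    A negative R2, as reported for the Decision Tree and Random Forest models above, means the model predicts worse than a constant fixed at the mean of the target.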

  6. Film Circulation dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Samoilova, Evgenia (Zhenya) (2024). Film Circulation dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7887671
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Loist, Skadi
    Samoilova, Evgenia (Zhenya)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

    A peer-reviewed data paper for this dataset is under review for publication in NECSUS_European Journal of Media Studies, an open access journal aiming at enhancing data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org

    Please cite this when using the dataset.

    Detailed description of the dataset:

    1 Film Dataset: Festival Programs

    The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

    The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

    The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
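    The long-to-wide reduction described above (one row per unique film, keeping the first sample festival) can be expressed in a few lines. A sketch with hypothetical column names; the real names are defined in the codebook.

    ```python
    import pandas as pd

    long_df = pd.read_csv("1_film-dataset_festival-program_long.csv")

    # Hypothetical column names: 'film_id' (unique film ID) and 'edition_year'
    # (year of the festival edition). Sort so the earliest appearance comes
    # first, then keep one row per film, mirroring how the wide file resolves
    # the roughly six percent of films sampled at more than one festival.
    wide_df = (long_df.sort_values(["film_id", "edition_year"])
                      .drop_duplicates(subset="film_id", keep="first"))
    ```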

    2 Survey Dataset

    The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

    The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

    The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.

    3 IMDb & Scripts

    The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

    The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

    The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

    The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

    The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

    The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to one crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

    The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

    The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

    The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, TV, DVD/Blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

    The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

    The dataset includes 8 text files containing the scripts for web scraping. They were written using R version 3.6.3 for Windows.

    The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

    The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then potentially using an alternative title and a basic search if no matches are found in the advanced search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records from the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with suggested films on the IMDb search page. Matching was done using data on directors, production year (+/- one year), and title, with a fuzzy matching approach using two methods, “cosine” and “osa”: the cosine similarity is used to match titles with a high degree of similarity, and the OSA algorithm is used to match titles that may have typos or minor variations.
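    The two string measures named above are described but not shown. As a self-contained illustration (in Python, not the R packages the scripts actually use), cosine similarity over character n-grams and the OSA distance can be implemented as follows.

    ```python
    from collections import Counter
    from math import sqrt

    def osa(a: str, b: str) -> int:
        """Optimal string alignment distance: edits plus adjacent transpositions."""
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i
        for j in range(len(b) + 1):
            d[0][j] = j
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
                if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                    d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
        return d[len(a)][len(b)]

    def cosine_sim(a: str, b: str, n: int = 2) -> float:
        """Cosine similarity over character n-gram counts (case-insensitive)."""
        grams = lambda s: Counter(s[i:i + n] for i in range(len(s) - n + 1))
        ga, gb = grams(a.lower()), grams(b.lower())
        dot = sum(ga[g] * gb[g] for g in ga)
        norm = sqrt(sum(v * v for v in ga.values())) * sqrt(sum(v * v for v in gb.values()))
        return dot / norm if norm else 0.0

    print(osa("berlinale", "berlniale"))            # 1: one adjacent transposition
    print(cosine_sim("The Matrix", "Matrix, The"))  # fairly high: shared bigrams, reordered
    ```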

    The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into the following five categories: a) 100% match (perfect match on title, year, and director); b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible doubles in the dataset and identifies them for a manual check.

    The script “r_4_scraping_functions” creates functions for scraping the data from the identified matches (based on the scripts described above and the manual check). These functions are used for scraping the data in the next script.

    The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” in order to scrape the IMDb data for the identified matches. This script does that for the first 100 films only, to check that everything works. Scraping the entire dataset took a few hours, so a test with a subsample of 100 films is advisable.

    The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

    The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.

    The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the number of missing values and errors.

    4 Festival Library Dataset

    The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

    The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories, units of measurement, data sources and coding and missing data.

    The csv file “4_festival-library_dataset_imdb-and-survey” contains data on all unique festivals collected from both IMDb and survey sources. This dataset appears in wide format, i.e. all information for each festival is listed in one row. This

  7. Integration of Slurry Separation Technology & Refrigeration Units: Air...

    • catalog.data.gov
    • datasets.ai
    Updated Jun 25, 2024
    Cite
    data.usaid.gov (2024). Integration of Slurry Separation Technology & Refrigeration Units: Air Quality - H2S [Dataset]. https://catalog.data.gov/dataset/integration-of-slurry-separation-technology-refrigeration-units-air-quality-h2s-4af17
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    United States Agency for International Development: https://usaid.gov/
    Description

    This is the raw H2S data: concentration of H2S in parts per million in the biogas. Each sheet (tab) is formatted to be exported as a .csv for use with the R code (AQ-June20.R). In order for this code to work properly, it is important that this file remain intact. Do not change the column names or codes for data, for example, and to be safe, don’t even sort. One simple change in the Excel file could make the code full of bugs.
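    If the export step itself is scripted, every sheet can be written out without touching the workbook. A minimal Python sketch; the workbook file name is a placeholder.

    ```python
    import pandas as pd

    # sheet_name=None returns a dict mapping sheet names to DataFrames; write
    # each sheet to its own CSV and leave the workbook itself unmodified.
    sheets = pd.read_excel("h2s_raw_data.xlsx", sheet_name=None)  # placeholder name
    for name, df in sheets.items():
        df.to_csv(f"{name}.csv", index=False)
    ```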

  8. HUN AWRA-R calibration nodes v01

    • cloud.csiss.gmu.edu
    • researchdata.edu.au
    • +2 more
    zip
    Updated Dec 14, 2019
    Cite
    Australia (2019). HUN AWRA-R calibration nodes v01 [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/f2da394a-3d08-4cf4-8c24-bf7751ea06a1
    Explore at:
    Available download formats: zip(11340)
    Dataset updated
    Dec 14, 2019
    Dataset provided by
    Australia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset is a shapefile which is a subset for the Hunter subregion containing geographical locations and other characteristics (see below) of streamflow gauging stations.

    There are 3 files that have been extracted from the Hydstra database to aid in identifying sites in the Hunter subregion and the type of data collected from each one.

    The 3 files are:

    Site - lists all sites available in Hydstra from data providers. The data provider is indicated by a suffix on the #Station value, as _xxx. For example, sites in NSW are _77 and QLD are _66 (a parsing sketch follows this list).

    Some sites do not have locational information and will not be able to be plotted.

    Period - the period table lists all the variables that are recorded at each site and the period of record.

    Variable - the variable table shows variable codes and names which can be linked to the period table.
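    A minimal Python sketch of reading the provider suffix off a #Station value, using only the two codes mentioned above; the station number itself is made up.

    ```python
    # Provider codes taken from the description above; extend as needed.
    PROVIDERS = {"77": "NSW", "66": "QLD"}

    def provider(station: str) -> str:
        """Map a '#Station' value like '210001_77' to its data provider."""
        code = station.rsplit("_", 1)[-1]
        return PROVIDERS.get(code, "unknown")

    print(provider("210001_77"))  # -> NSW (illustrative station number)
    ```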

    Purpose

    Locations are used as pour points in order to define reach areas for river system modelling.

    Dataset History

    Subset of data for the Hunter subregion that was extracted from the Bureau of Meteorology's Hydstra system; it includes all gauges where data has been received from the lead water agency of each jurisdiction. The gauges shapefile for all bioregions was intersected with the Hunter subregion boundary to identify and extract gauges within the subregion.

    Dataset Citation

    Bioregional Assessment Programme (2016) HUN AWRA-R calibration nodes v01. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/f2da394a-3d08-4cf4-8c24-bf7751ea06a1.

    Dataset Ancestors

  9. Data from: Order Hymenoptera, family Formicidae

    • gbif.org
    • bionomia.net
    • +3 more
    Updated Nov 26, 2024
    Cite
    Cedric A. Collingwood; Donat Agosti; Mostafa R. Sharaf; Antonius van Harten (2024). Order Hymenoptera, family Formicidae [Dataset]. http://doi.org/10.5281/zenodo.1168586
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    Plazi
    Global Biodiversity Information Facility: https://www.gbif.org/
    Authors
    Cedric A. Collingwood; Donat Agosti; Mostafa R. Sharaf; Antonius van Harten
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains the digitized treatments in Plazi based on the original journal article Cedric A. Collingwood, Donat Agosti, Mostafa R. Sharaf, Antonius van Harten (2011): Order Hymenoptera, family Formicidae. Arthropod fauna of the UAE 4: 1-70, DOI: 10.5281/zenodo.1168586

  10. Replication Data for: A Corpus Based Analysis of V2 Variation in West...

    • dataverse.no
    • dataverse.azure.uit.no
    • +1 more
    bin, csv +2
    Updated Sep 28, 2023
    + more versions
    Cite
    Chloé Lybaert; Bernard De Clerck; Jorien Saelens; Ludovic De Cuypere (2023). Replication Data for: A Corpus Based Analysis of V2 Variation in West Flemish and French Flemish Dialects [Dataset]. http://doi.org/10.18710/NSFN2B
    Explore at:
    Available download formats: csv(823), txt(15549), text/comma-separated-values(93006), csv(85373), bin(14055)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Chloé Lybaert; Bernard De Clerck; Jorien Saelens; Ludovic De Cuypere
    License

    https://dataverse.no/api/datasets/:persistentId/versions/1.2/customlicense?persistentId=doi:10.18710/NSFN2B

    Time period covered
    1960 - 1970
    Area covered
    French, Belgium, Oost-Vlaanderen, France, Département du Nord - Département 59, West-Vlaanderen, Belgium
    Description

    Dataset abstract: The dataset includes an annotated dataset of N = 1413 sentences (or parts thereof) taken from authentic spoken corpus data from West Flemish and French Flemish (dialects of Dutch). The sentences are annotated for V2 variation (Subject-Verb inversion, the outcome variable of the associated study) and seven predictor variables, including city, region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield. The dataset also includes geographical data to create a dialect map showing the relative frequencies of V2 variation. An R Notebook with the data analysis is provided.

    Article abstract: This paper explores V2 variation in West Flemish and French Flemish dialects of Dutch based on an extensive corpus of authentic spoken data. After taking stock of the existing literature, we probe into the effect of region, prosodic integration, form and function of the topicalized constituent, form of the subject, and the number of constituents in the prefield on (non)inverted word order. This is the first study that carries out regression analysis on the combined impact of these variables in the entire West Flemish and French Flemish region, with additional visualization of effect sizes. The results show that noninversion is generally more widespread than originally anticipated, with unexpectedly higher occurrence of noninversion in continental West Flemish and lower frequencies in western West Flemish. With the exception of the variable number of constituents in the prefield, all other variables had a significant impact on word order: Clausal topicalized elements, elements that have peripheral functions, and elements that lack prosodic integration all favor noninverted word order. The form of the subject also impacted word order, but its effect is sometimes overruled by discourse considerations.
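    The model described above is a binary outcome (inverted vs. noninverted word order) regressed on the listed predictors. The analysis itself ships as an R Notebook; purely as an illustration, the same kind of model in Python with hypothetical variable names:

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    # Column names are hypothetical; the dataset's real variable names are
    # documented in its README. 'inverted' is assumed to be coded 0/1.
    df = pd.read_csv("v2_variation.csv")  # placeholder file name

    model = smf.logit(
        "inverted ~ region + prosodic_integration + topic_form"
        " + topic_function + subject_form + n_prefield",
        data=df,
    ).fit()
    print(model.summary())
    ```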

  11. GAL Surface Water Reaches for Risk and Impact Analysis 20180803

    • researchdata.edu.au
    • data.gov.au
    • +1more
    Updated Dec 7, 2018
    Cite
    Bioregional Assessment Program (2018). GAL Surface Water Reaches for Risk and Impact Analysis 20180803 [Dataset]. https://researchdata.edu.au/gal-surface-water-analysis-20180803/2989417
    Dataset updated
    Dec 7, 2018
    Dataset provided by
    Data.gov: https://data.gov/
    Authors
    Bioregional Assessment Program
    License

    Attribution 2.5 (CC BY 2.5): https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Description

    Abstract

    The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    Dataset History

    Stream network constructed and defined using datasets shown in the Lineage. The stream network was constructed using surface water nodes to define reaches, and the classification was assigned by using the data from the stream network from the lineage and then assigning the following classification:

    1. surface water change due to hydrology
    2. no change modelled at link node within PAE
    3. modelled no change at link node
    4. modelled change at link node
    5. assumed change due to proximity to mine pit
    6. assumed change due to hydrology

    Further tie-breaks were decided based on stream order or stream segment length.

    Dataset Citation

    Bioregional Assessment Programme (2017) GAL Surface Water Reaches for Risk and Impact Analysis 20180803. Bioregional Assessment Derived Dataset. Viewed 12 December 2018, http://data.bioregionalassessments.gov.au/dataset/64c4d16f-bdfa-4fd6-bd72-c459503003bd.

    Dataset Ancestors

    * Derived From Onsite and offsite mine infrastructure for the Carmichael Coal Mine and Rail Project, Adani Mining Pty Ltd 2012
    * Derived From Alpha Coal Project Environmental Impact Statement
    * Derived From Geofabric Surface Cartography - V2.1
    * Derived From QLD Exploration and Production Tenements (20140728)
    * Derived From China Stone Coal Project initial advice statement
    * Derived From Kevin's Corner Project Environmental Impact Statement
    * Derived From Galilee surface water modelling nodes
    * Derived From Geoscience Australia GEODATA TOPO series - 1:1 Million to 1:10 Million scale
    * Derived From China First Galilee Coal Project Environmental Impact Assessment
    * Derived From GEODATA TOPO 250K Series 3
    * Derived From Seven coal mines included in Galilee surface water modelling

  12. HUN AWRA-R Gauge Station Cross Sections v01

    • data.gov.au
    • researchdata.edu.au
    • +2 more
    zip
    Updated Nov 20, 2019
    + more versions
    Cite
    Bioregional Assessment Program (2019). HUN AWRA-R Gauge Station Cross Sections v01 [Dataset]. https://data.gov.au/data/dataset/activity/93fbc2b9-463c-42f6-8817-cb45a54ee28e
    Explore at:
    Available download formats: zip(12360)
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This dataset was supplied to the Bioregional Assessment Programme by a third party and is presented here as originally supplied. The metadata was not provided by the data supplier and has been compiled by the programme based on known details.

    River cross sections for selected gauging stations in the Hunter subregion, extracted from PINNEENA CM version 10.2 on DVD, released in May 2014. The data is in a comma separated file (CSV). The relevant information contained in the CSV is as follows: site, cross section ID, order (measurement), chain and level.

    Purpose

    The cross-sections are used in river modelling to determine river reach volumes and gains (e.g. rainfall on river) and losses (e.g. leakage to groundwater, evapotranspiration from river).
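    One common way such cross sections feed into reach-volume estimates is to integrate flow depth across the section. A sketch under the assumption that chain is horizontal distance and level is bed elevation; the file and column names are guesses.

    ```python
    import numpy as np
    import pandas as pd

    xs = pd.read_csv("cross_sections.csv")   # placeholder file name
    one = (xs[xs["cross_section_id"] == 1]   # guessed column names
             .sort_values("order"))

    stage = 10.0  # example water level, in the survey's elevation datum
    depth = np.clip(stage - one["level"].to_numpy(), 0.0, None)
    area = np.trapz(depth, x=one["chain"].to_numpy())  # wetted area of the section
    print(f"wetted area at stage {stage}: {area:.1f}")
    ```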

    Dataset History

    This dataset was supplied to the Bioregional Assessment Programme by the New South Wales Office of Water through the PINNEENA CM version 10.2 on DVD released in May 2014, and is presented here as originally supplied.

    Dataset Citation

    Bioregional Assessment Programme (2016) HUN AWRA-R Gauge Station Cross Sections v01. Bioregional Assessment Source Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/93fbc2b9-463c-42f6-8817-cb45a54ee28e.

  13. iEEG-Multicenter-Dataset

    • openneuro.org
    Updated Dec 2, 2020
    + more versions
    Cite
    Adam Li; Sara Inati; Kareem Zaghloul; Nathan Crone; William Anderson; Emily Johnson; Iahn Cajigas; Damian Brusko; Jonathan Jagid; Angel Claudio; Andres Kanner; Jennifer Hopp; Stephanie Chen; Jennifer Haagensen; Sridevi Sarma (2020). iEEG-Multicenter-Dataset [Dataset]. http://doi.org/10.18112/openneuro.ds003029.v1.0.1
    Dataset updated
    Dec 2, 2020
    Dataset provided by
    OpenNeuro: https://openneuro.org/
    Authors
    Adam Li; Sara Inati; Kareem Zaghloul; Nathan Crone; William Anderson; Emily Johnson; Iahn Cajigas; Damian Brusko; Jonathan Jagid; Angel Claudio; Andres Kanner; Jennifer Hopp; Stephanie Chen; Jennifer Haagensen; Sridevi Sarma
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Fragility Multi-Center Retrospective Study

    iEEG and EEG data from 5 centers, with a total of 100 subjects, are organized in our study. We publish 4 centers' datasets here due to data sharing issues.

    Acquisitions include ECoG and SEEG. Each run specifies a different snapshot of EEG data from that specific subject's session. For seizure sessions, this means that each run is an EEG snapshot around a different seizure event.

    For additional clinical metadata about each subject, refer to the clinical Excel table in the publication.

    Data Availability

    NIH, JHH, UMMC, and UMF agreed to share. Cleveland Clinic did not, so access to its data requires an additional DUA.

    All data, except for Cleveland Clinic's, were approved by their centers to be de-identified and shared. All data in this dataset have no PHI or other identifiers associated with patients. In order to access Cleveland Clinic data, please forward all requests to Amber Sours, SOURSA@ccf.org:

    Amber Sours, MPH Research Supervisor | Epilepsy Center Cleveland Clinic | 9500 Euclid Ave. S3-399 | Cleveland, OH 44195 (216) 444-8638

    You will need to sign a data use agreement (DUA).

    Sourcedata

    For each subject, there was a raw EDF file, which was converted into the BrainVision format with mne_bids. Each subject with SEEG implantation also has an Excel table, called electrode_layout.xlsx, which outlines where the clinicians marked each electrode anatomically. Note that there is no rigorous atlas applied, so the main points of interest are: WM, GM, VENTRICLE, CSF, and OUT, which represent white matter, gray matter, ventricle, cerebrospinal fluid and outside the brain. WM, VENTRICLE, CSF and OUT channels were removed from further analysis. These were labeled in the corresponding BIDS channels.tsv sidecar file as status=bad. The dataset uploaded to openneuro.org does not contain the sourcedata since there was an extra anonymization step that occurred when fully converting to BIDS.
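    The EDF-to-BIDS step described above follows the usual mne_bids pattern. A minimal sketch with placeholder file, subject, task and channel names; this is not the study's actual conversion script.

    ```python
    import mne
    from mne_bids import BIDSPath, write_raw_bids

    raw = mne.io.read_raw_edf("sub-01_seizure.edf")  # placeholder file name
    raw.info["line_freq"] = 60                       # required by write_raw_bids

    # Channels marked WM/VENTRICLE/CSF/OUT in electrode_layout.xlsx end up as
    # status=bad in the BIDS channels.tsv; MNE tracks them in info["bads"].
    raw.info["bads"] = ["POL A1", "POL A2"]          # placeholder channel names

    bids_path = BIDSPath(subject="01", session="presurgery", task="ictal",
                         run="01", datatype="ieeg", root="./bids_dataset")
    write_raw_bids(raw, bids_path, overwrite=True)
    ```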

    Derivatives

    Derivatives include:

    * fragility analysis
    * frequency analysis
    * graph metrics analysis
    * figures

    These can be computed by following this paper: Neural Fragility as an EEG Marker for the Seizure Onset Zone

    Events and Descriptions

    Within each EDF file, there are event markers that are annotated by clinicians, which may inform you of specific clinical events that are occurring in time, or of when they saw seizure onset and offset (clinical and electrographic).

    During a seizure event, event markers may follow this time course:

    * eeg onset, or clinical onset - the onset of a seizure that is either marked electrographically, or by clinical behavior. Note that the clinical onset may not always be present, since some seizures manifest without clinical behavioral changes.
    * Marker/Mark On - these are annotations, present in some cases, where a health practitioner injects a chemical marker for use in ICTAL SPECT imaging after a seizure occurs. This is commonly done to see which portions of the brain are active metabolically.
    * Marker/Mark Off - This is when the ICTAL SPECT stops imaging.
    * eeg offset, or clinical offset - this is the offset of the seizure, as determined either electrographically, or by clinical symptoms.
    

    Other events included may be beneficial for you to understand the time course of each seizure. Note that ICTAL SPECT occurs in all Cleveland Clinic data. Note that seizure markers are not consistent in their description naming, so one might encode some specific regular-expression rules to consistently capture seizure onset/offset markers across all datasets. In the case of UMMC data, all onset and offset markers were provided by the clinicians in an Excel sheet instead of via the EDF file, so we added the annotations manually to each EDF file.
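    As an illustration of such regular-expression rules (the patterns below are invented for the example; real marker strings vary by center):

    ```python
    import re

    ONSET = re.compile(r"(eeg|clinical|sz|seizure)\s*(onset|start)", re.I)
    OFFSET = re.compile(r"(eeg|clinical|sz|seizure)\s*(offset|end|stop)", re.I)

    for desc in ["EEG onset", "sz offset", "Marker/Mark On"]:
        if ONSET.search(desc):
            print(desc, "-> onset")
        elif OFFSET.search(desc):
            print(desc, "-> offset")
        else:
            print(desc, "-> other annotation")
    ```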

    Seizure Electrographic and Clinical Onset Annotations

    For various datasets, there are seizures present within the dataset. Generally there is only one seizure per EDF file. When seizures are present, they are marked electrographically (and clinically if present) via standard approaches in the epilepsy clinical workflow.

    Clinical onset is simply the manifestation of the seizure as clinical symptoms. Sometimes this marker may not be present.

    Seizure Onset Zone Annotations

    What is actually important in the evaluation of these datasets are the clinicians' annotations of their localization hypotheses of the seizure onset zone.

    These generally include:

    * early onset: the earliest onset electrodes participating in the seizure that clinicians saw
    * early/late spread (optional): the electrodes that showed epileptic spread activity after seizure onset. Not all seizures have spread contacts annotated.
    

    Surgical Zone (Resection or Ablation) Annotations

    For patients with the post-surgical MRI available, the segmentation process outlined above tells us which electrodes were within the surgically removed brain region.

    Otherwise, clinicians give us their best estimate of which electrodes were resected/ablated based on their surgical notes.

    For surgical patients whose postoperative medical records did not explicitly indicate specific resected or ablated contacts, manual visual inspection was performed to determine the approximate contacts that were located in later resected/ablated tissue. Postoperative T1 MRI scans were compared against post-SEEG implantation CT scans or CURRY coregistrations of preoperative MRI/post-SEEG CT scans. Contacts of interest in and around the area of the reported resection were selected individually, and the corresponding slice was navigated to on the CT scan or CURRY coregistration. After identifying landmarks of that slice (e.g. skull shape, skull features, shape of prominent brain structures like the ventricles, central sulcus, superior temporal gyrus, etc.), the location of a given contact in relation to these landmarks, and the location of the slice along the axial plane, the corresponding slice in the postoperative MRI scan was navigated to. The resected tissue within the slice was then visually inspected and compared against the distinct landmarks identified in the CT scans; if brain tissue was not present in the corresponding location of the contact, the contact was marked as resected/ablated. This process was repeated for each contact of interest.

    References

    Adam Li, Chester Huynh, Zachary Fitzgerald, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Emily Johnson, William Anderson, Nathan Crone, Sara Inati, Kareem Zaghloul, Juan Bulacio, Jorge Gonzalez-Martinez, Sridevi V. Sarma. Neural Fragility as an EEG Marker of the Seizure Onset Zone. bioRxiv 862797; doi: https://doi.org/10.1101/862797

    Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Höchenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. and Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software 4: (1896). https://doi.org/10.21105/joss.01896

    Holdgraf, C., Appelhoff, S., Bickel, S., Bouchard, K., D'Ambrosio, S., David, O., … Hermes, D. (2019). iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology. Scientific Data, 6, 102. https://doi.org/10.1038/s41597-019-0105-7

    Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8

  14. Replication data for: Old Church Slavonic byti Part One and Part Two

    • dataverse.no
    • search.dataone.org
    pdf +2
    Updated Sep 28, 2023
    Cite
    Hanne M. Eckhoff; Tore Nesset; Laura A. Janda (2023). Replication data for: Old Church Slavonic byti Part One and Part Two [Dataset]. http://doi.org/10.18710/P9REAV
    Explore at:
    Available download formats: text/plain; charset=us-ascii(1162043), text/plain; charset=utf-8(5405111), pdf(129969), text/plain; charset=utf-8(5405110), text/plain; charset=utf-8(3035), text/plain; charset=utf-8(2673), text/plain; charset=utf-8(14170), text/plain; charset=utf-8(5674)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Hanne M. Eckhoff; Tore Nesset; Laura A. Janda
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Eastern Europe, Norway
    Dataset funded by
    The Research Council of Norway
    Description

    Abstract Part One: There is controversy over whether byti ‘be’ in Old Church Slavonic functioned as an imperfective verb with an unusually large number of inflected forms or as an aspectual pair of verbs, reflecting its suppletive origin from two stems (es- and bū-). We offer an objective empirical approach to the status of this verb, using statistical analysis of 2,428 attestations of byti in comparison with 9,694 attestations of 129 other verbs. This makes it possible to accurately locate byti in the context of the verbal lexicon of Old Church Slavonic. The comparison is made via grammatical profiles, a method that examines the frequency distribution of each verb’s inflected forms. This comparison is undertaken in two rounds, one assuming that byti is a single verb, and the other assuming that it is a pair of verbs. Both assumptions yield reasonable results, and although the grammatical profile analyses do not suffice to solve the controversy, they lay the groundwork for further analysis in Part Two that argues for a single-verb interpretation of byti.

    Data and R Scripts Part One: Our analysis uses two datasets, one that presents the forms of byti as a single paradigm, verbs.csv, and one that presents it as a pair of verbs, splitverbs.csv. In order to represent the Church Slavonic orthography, you will need our transliteration script, translit.r. This script is sourced by the scripts for our analysis, which present byti as either a single verb or a verb pair: PartOneSingleVerb.r and PartOneVerbPair.r. These scripts perform all of the steps for the analysis in our article and generate the plots.

    Abstract Part Two: The verb byti ‘be’ in Old Church Slavonic appears in an unusually rich inventory of grammatical constructions. We analyze corpus data on the distribution of constructions in order to assess the status of this verb as either a single verb or an aspectual pair of verbs. Our study moves beyond a strict structuralist interpretation of the behavior of byti, instead recognizing the real variation and ambiguity in the data. Our findings make both theoretical and descriptive advances. The radial category structure is a central tenet of cognitive linguistics, but until now such structures have usually been posited by researchers based on their qualitative insights from data. We show that it is possible to identify both the nodes and the structure of a radial category statistically, using only linguistic data as input. We provide an enhanced description of byti that clearly distinguishes between core uses and those that are more peripheral, and shows the relationships among them. While we find some evidence in support of an aspectual pair, most evidence points instead toward a single verb.

    Data and R Script Part Two: The dataset used in this analysis is frames.csv. The R script used in this analysis is PartTwo.r.
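    A grammatical profile as used above is simply each verb's relative frequency distribution over its inflected forms. A compact sketch, in Python rather than the R scripts shipped with the data, and with hypothetical column names:

    ```python
    import pandas as pd

    df = pd.read_csv("verbs.csv")  # one row per attestation (assumed layout)

    # Rows: verb lemma; columns: grammatical form; cells: relative frequency.
    profiles = pd.crosstab(df["lemma"], df["form"], normalize="index")
    print(profiles.loc["byti"])  # the profile for byti, if so labeled in the file
    ```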

  15. Replication Data for: Understanding ‘many’ through the lens of Ukrainian...

    • search-demo.dataone.org
    • dataverse.no
    • +1 more
    Updated Sep 25, 2024
    + more versions
    Cite
    Janda, Laura Alexis (2024). Replication Data for: Understanding ‘many’ through the lens of Ukrainian багато [Dataset]. http://doi.org/10.18710/Y7VGQE
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    DataverseNO
    Authors
    Janda, Laura Alexis
    Time period covered
    Jan 1, 1742 - Jan 1, 2023
    Description

    Dataset description: The General Regionally Annotated Corpus of Ukrainian (GRAC, Shvedova et al. 2017-2024, uacorpus.org) was consulted to collect data for further analysis concerning the distribution of Singular vs. Plural verb forms in the target bahato construction. GRAC is a Sketch Engine corpus of over 1.8 billion words, representing texts from over 30,000 authors created between 1816 and 2023. This corpus is designed to serve as source material for linguistic research on Standard Ukrainian. Our data was collected during the month of February 2024. We extracted and annotated 28,491 examples of the bahato construction. An additional set of examples was collected from the Russian National Corpus (ruscorpora.ru) during the month of August 2024 to provide comparison with the Russian mnogo construction. For this purpose, 6,612 examples were extracted and annotated for word order and Singular vs. Plural verb agreement. Both the Ukrainian and the Russian data are included in this dataset, along with the R scripts used to analyze this data.

    Article abstract: We reveal an ongoing language change in Ukrainian involving a construction with a subject comprised of the indefinite quantifier багато ‘many’ modifying a noun phrase in the Genitive Plural. Number agreement on the verb varies, allowing both Singular (in 69.1% of attestations) and Plural (in 30.9% of attestations). Based on statistical analysis of corpus data, we investigate the influence of the factors of year of creation, word order of subject and verb, and animacy of the subject on the choice of verb number. We find that, while all combinations of word order and animacy are robustly attested, VS word order and inanimate subjects tend to prefer Singular, whereas SV word order and animate subjects tend to prefer Plural. Since about the 1950s, the proportion of Plural has been increasing, overtaking Singular in the current decade. We propose that this Singular vs. Plural variation is motivated by the human embodied experience of construing a group of items as either a homogeneous mass (and therefore Singular) or a multiplicity of individuals (and therefore Plural). This proposal is supported by the identification of micro-constructions that prefer Singular and show reduced individuation of human beings.
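    The diachronic trend reported above (Plural overtaking Singular) reduces to a proportion-by-decade table. A sketch with hypothetical file and column names, not the shipped R scripts:

    ```python
    import pandas as pd

    df = pd.read_csv("bahato_grac.csv")  # placeholder file name
    # Assumed columns: 'year' of creation and 'number' coded 'sg'/'pl'.
    df["decade"] = (df["year"] // 10) * 10
    share = pd.crosstab(df["decade"], df["number"], normalize="index")
    print(share.tail())  # Plural share rising toward the current decade
    ```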

  16. Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program...

    • openicpsr.org
    Updated May 18, 2018
    + more versions
    Cite
    Jacob Kaplan (2018). Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Hate Crime Data 1991-2019 [Dataset]. http://doi.org/10.3886/E103500V7
    Dataset updated
    May 18, 2018
    Dataset provided by
    University of Pennsylvania
    Authors
    Jacob Kaplan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1991 - 2019
    Area covered
    United States
    Description

    !!!WARNING~~~ This dataset has a large number of flaws and is unable to properly answer many questions that people generally use it to answer, such as whether national hate crimes are changing (or at least they use the data so improperly that they get the wrong answer). A large number of people using this data (academics, advocates, reporting, US Congress) do so inappropriately and get the wrong answer to their questions as a result. Indeed, many published papers using this data should be retracted. Before using this data I highly recommend that you thoroughly read my book on UCR data, particularly the chapter on hate crimes (https://ucrbook.com/hate-crimes.html) as well as the FBI's own manual on this data. The questions you could potentially answer well are relatively narrow and generally exclude any causal relationships. ~~~WARNING!!!

    Version 8 release notes: Adds 2019 data.

    Version 7 release notes: Changes release notes description, does not change data.

    Version 6 release notes: Adds 2018 data.

    Version 5 release notes: Adds data in the following formats: SPSS, SAS, and Excel. Changes project name to avoid confusing this data for the ones done by NACJD. Adds data for 1991. Fixes bug where bias motivation "anti-lesbian, gay, bisexual, or transgender, mixed group (lgbt)" was labeled "anti-homosexual (gay and lesbian)" prior to 2013, causing there to be two columns and zero values for years with the wrong label. All data is now directly from the FBI, not NACJD. The data initially comes as ASCII+SPSS Setup files and is read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R.

    Version 4 release notes: Adds data for 2017. Adds rows that submitted a zero-report (i.e. that agency reported no hate crimes in the year); this is for all years 1992-2017. Made changes to categorical variables (e.g. bias motivation columns) to make categories consistent over time; different years had slightly different names (e.g. 'anti-am indian' and 'anti-american indian') which I made consistent. Made the 'population' column, which is the total population in that agency.

    Version 3 release notes: Adds data for 2016. Orders rows by year (descending) and ORI.

    Version 2 release notes: Fix bug where Philadelphia Police Department had incorrect FIPS county code.

    The Hate Crime data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about hate crimes reported in the United States. Please note that the files are quite large and may take some time to open. Each row indicates a hate crime incident for an agency in a given year. I have made a unique ID column ("unique_id") by combining the year, agency ORI9 (the 9 character Originating Identifier code), and incident number columns together. Each column is a variable related to that incident or to the reporting agency. Some of the important columns are the incident date, what crime occurred (up to 10 crimes), the number of victims for each of these crimes, the bias motivation for each of these crimes, and the location of each crime. It also includes the total number of victims, total number of offenders, and race of offenders (as a group). Finally, it has a number of columns indicating if the victim for each offense was a certain type of victim or not (e.g. individual victim, business victim, religious victim, etc.).

    The only changes I made to the data are the following: minor changes to column names to make all column names 32 characters or fewer (so it can be saved in a Stata format), making all character values lower case, and reordering columns. I also generated incident month, weekday, and month-day variables from the incident date variable included in the original data.
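    The unique_id scheme described above is straightforward to reproduce or verify. A sketch with guessed column names for the cleaned file:

    ```python
    import pandas as pd

    df = pd.read_csv("hate_crimes.csv")  # placeholder file name
    # Guessed column names: 'year', 'ori9', 'incident_number'.
    df["unique_id"] = (df["year"].astype(str) + "_"
                       + df["ori9"].astype(str) + "_"
                       + df["incident_number"].astype(str))
    assert df["unique_id"].is_unique  # each incident should appear once
    ```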

  17. threads-stack-overflow

    • zenodo.org
    json
    Updated Dec 16, 2023
    Cite
    Nicholas Landry (2023). threads-stack-overflow [Dataset]. http://doi.org/10.5281/zenodo.10373328
    Explore at:
    Available download formats: json
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Nicholas Landry
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    This is a temporal higher-order network dataset, which here means a sequence of timestamped hyperedges where each hyperedge is a set of nodes. In this dataset, nodes are users on stackoverflow.com, and a hyperedge comes from users participating in a thread that lasts for at most 24 hours. The timestamps are the time of the post, but normalized so that the earliest post starts at 0.
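
    As a rough sketch of how one might load these timestamped hyperedges in R, the snippet below assumes the JSON is an array of hyperedge objects with hypothetical "nodes" and "timestamp" fields; the real schema may differ.

        library(jsonlite)

        # Hypothetical schema: each hyperedge has a "nodes" list (user IDs)
        # and a normalized "timestamp" (earliest post = 0).
        threads <- fromJSON("threads-stack-overflow.json", simplifyVector = FALSE)

        first <- threads[[1]]
        cat("nodes:", unlist(first$nodes), "\n")
        cat("timestamp:", first$timestamp, "\n")

        # Hyperedge sizes, i.e. how many users participated in each thread.
        sizes <- vapply(threads, function(e) length(e$nodes), integer(1))
        print(table(sizes))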

    Source of original data

    Source: threads-stack-overflow dataset

    References

    If you use this data, please cite the following paper:

  18. d

    Replication Data for: Together and apart: Perfective verbs with a prefix and...

    • search.dataone.org
    • dataverse.harvard.edu
    • +2more
    Updated Jul 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Nordrum, Maria (2024). Replication Data for: Together and apart: Perfective verbs with a prefix and the semelfactive suffix -nu- in Contemporary Standard Russian [Dataset]. https://search.dataone.org/view/sha256%3Aa893825028302f869d69c11adfd09f9b1fe2d1d0246f1f5083baa2192aa71610
    Explore at:
    Dataset updated
    Jul 29, 2024
    Dataset provided by
    DataverseNO
    Authors
    Nordrum, Maria
    Time period covered
    Jan 1, 1950 - Jan 1, 2017
    Description

    This dataset includes all the data files that were used for the studies in my PhD dissertation "Together and apart: Perfective verbs with a prefix and the semelfactive suffix -nu- in Contemporary Standard Russian" (2019). Most of the files involve data tables with annotated corpus data from the Russian National Corpus (www.ruscorpora.ru). Every tabular data file is available in two formats: .csv and .xlsx. The data files are numbered so that they appear in the same order as they become relevant in the dissertation, and their contents are described in the ReadMe files with the same numbers. In addition, the dataset contains the R script and .txt file that were used to make the plot in Chapter 2. The dissertation investigates Russian perfective verbs with a prefix and the semelfactive suffix -nu-, such as zaxlopnut’ ‘slam shut’, referred to for the sake of simplicity as “Pref-Nu verbs”. Pref-Nu verbs have received only marginal attention in the scholarly literature, so a main goal of the dissertation is to shed light on their distribution and productivity in Contemporary Standard Russian, as well as the types of verb clusters and semantic classes they represent. A second main goal is to compare the behavior of Pref-Nu verbs with the behavior of perfective verbs that have only one of the two relevant affixes, i.e. only a prefix, such as zaxlopat’ ‘begin to slam’, or only the semelfactive suffix -nu-, such as xlopnut’ ‘slam, clap, bang once’. All of these questions are explored with data from the Russian National Corpus (years 1950-2017), and, as a general tendency, Pref-Nu verbs are found to differ from the other two verb types in that they express a single “quantum” of an action that yields some result. The choice between Pref-Nu verbs and other perfectives is furthermore explored through an informant experiment that focuses on cases of near-synonymy between related verbs. The results of the experiment indicate that none of the verbs are fully synonymous, although their semantic differences are often subtle.

  19. m

    Data for: A systematic review showed no performance benefit of machine...

    • data.mendeley.com
    • search.datacite.org
    Updated Mar 14, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ben Van Calster (2019). Data for: A systematic review showed no performance benefit of machine learning over logistic regression for clinical prediction models [Dataset]. http://doi.org/10.17632/sypyt6c2mc.1
    Explore at:
    Dataset updated
    Mar 14, 2019
    Authors
    Ben Van Calster
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The uploaded files are:

    1) An Excel file containing 6 sheets, in order: "Data Extraction" (the summarized final data extractions from the three reviewers involved), "Comparison Data" (data related to the comparisons investigated), "Paper level data" (summaries at the paper level), "Outcome Event Data" (information on the number of events for every outcome investigated within a paper), and "Tuning Classification" (data on how the hyperparameters of the machine learning algorithms were tuned).

    2) The R script used for the analysis. To read the data, save the "Comparison Data", "Paper level data", and "Outcome Event Data" Excel sheets as txt files. In the R script, srpap refers to the "Paper level data" sheet, srevents to the "Outcome Event Data" sheet, and srcompx to the "Comparison Data" sheet (a hypothetical loading sketch follows this list).

    3) Supplementary material, including the search string, tables of data, and figures.

    4) PRISMA checklist items
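
    A hypothetical version of the loading step described in item 2, in R. The file names for the three saved sheets are assumptions; adjust them to match however you named the exported txt files.

        # Assumed file names for the three sheets saved as tab-delimited text.
        srpap    <- read.delim("paper_level_data.txt",   stringsAsFactors = FALSE)
        srevents <- read.delim("outcome_event_data.txt", stringsAsFactors = FALSE)
        srcompx  <- read.delim("comparison_data.txt",    stringsAsFactors = FALSE)

        str(srpap)  # sanity-check the structure before running the analysis script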

  20. g

    NOAA GOES-R Series Geostationary Lightning Mapper (GLM) Level 0 Data

    • gimi9.com
    • catalog.data.gov
    Updated Sep 19, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2023). NOAA GOES-R Series Geostationary Lightning Mapper (GLM) Level 0 Data [Dataset]. https://gimi9.com/dataset/data-gov_noaa-goes-r-series-geostationary-lightning-mapper-glm-level-0-data1
    Explore at:
    Dataset updated
    Sep 19, 2023
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data collection consists of archived Geostationary Operational Environmental Satellite-R (GOES-R) Series Geostationary Lightning Mapper (GLM) Level 0 data from the GOES-East and GOES-West satellites in the operational (OPS) and post-launch test (PLT) phases. The GOES-R Series provides continuity of the GOES mission through 2035 and improvements in geostationary satellite observational data. GOES-16, the first GOES-R satellite, began operating as GOES-East on December 18, 2017. GOES-17 began operating as GOES-West on February 12, 2019. GOES-T launched on March 1, 2022, and was renamed GOES-18 on March 14, 2022. GOES-U, the final satellite in the series, is scheduled to launch in 2024. GLM is a near-infrared optical transient detector observing the Western Hemisphere. The GLM Level 0 data are composed of Consultative Committee for Space Data Systems (CCSDS) packets containing the science, housekeeping, engineering, and diagnostic telemetry data downlinked from the instrument. The Level 0 data files also contain orbit and attitude/angular-rate packets generated by the GOES spacecraft. Each CCSDS packet contains a unique Application Process Identifier (APID) in its primary header that identifies the specific type of packet and is used to support interpretation of its contents. Users may refer to the GOES-R Series Product Definition and Users’ Guide (PUG) Volume 1 (Main) and Volume 2 (Level 0 Products) for Level 0 data documentation. Related instrument calibration data and Level 1b processing information are archived and available for order at the NOAA CLASS website. The GLM Level 0 data files are delivered in a netCDF-4 file format; however, the constituent CCSDS packets are stored in a byte array, making the data opaque to standard netCDF reader applications. The GLM Level 0 data files are packaged in hourly tar files (data bundles) by satellite for the archive. Recently ingested archive tar files are available for 14 days on an anonymous FTP server for users to download. Data archived on offline tape may be requested from NCEI.
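
    To make the packet structure concrete, here is an illustrative R sketch (not an official reader) that extracts the APID from a 6-byte CCSDS primary header. The file name is a placeholder, and the snippet assumes the raw packet bytes have already been pulled out of the netCDF byte array.

        # Illustrative only; the file name is a placeholder.
        con <- file("glm_level0_packet.bin", "rb")
        hdr <- readBin(con, what = "raw", n = 6)  # 6-byte CCSDS primary header
        close(con)

        # Per the CCSDS standard, the APID is the low 11 bits of the first two
        # header bytes (3-bit version, 1-bit type, 1-bit secondary-header flag,
        # then the 11-bit APID).
        apid <- bitwOr(bitwShiftL(bitwAnd(as.integer(hdr[1]), 0x07L), 8),
                       as.integer(hdr[2]))

        # The packet length field (bytes 5-6) holds one less than the number
        # of bytes in the packet data field.
        data_len <- bitwOr(bitwShiftL(as.integer(hdr[5]), 8), as.integer(hdr[6])) + 1

        cat(sprintf("APID: 0x%03X, packet data field: %d bytes\n", apid, data_len))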
