https://creativecommons.org/share-your-work/public-domain/pdm
This collection comprises unaltered data files downloaded from https://eddataexpress.ed.gov/download/data-library on February 6, 2025. The original access page consisted of a table with category filters, which provided links to data ZIP files containing the specified data fields. This table has been saved in tabular data formats here in the Index folder, with the original web links replaced by the matching ZIP filename only, which essentially replicates the functionality of the original web page in a downloadable format.

In the website's underlying file structure, the original ZIP files were nested within folders named according to the format EID_####, apparently to avoid conflicts between files with the same name. These seeming duplications may have been due to updates or revisions made to a data file. To preserve this original order, the ZIP files were renamed by appending the EID number to their original file name. The files were not otherwise unzipped or altered in any way from their original state.

At the time of download, the page at https://eddataexpress.ed.gov/download/data-library displayed the following two notices in red:

"The COVID-19 pandemic disrupted the collection and reporting of data on EDE, beginning in SY 2019-20. The Department urges abundant caution when using the data and recommends reviewing the relevant data notes prior to use or interpretation. This includes data on state assessments, graduation rates, and chronic absenteeism."

"WARNING: The data library functionality has stopped working temporarily for many SY2122 school files. Please go to the download tool page to download your data of interest. We apologize for the inconvenience."

--------------------

The "About Us" page from the ED Data Express website had this to say about its resources:

Purpose of ED Data Express

ED Data Express is a website designed to improve the public's ability to access and explore high-value state- and district-level education data collected by the U.S. Department of Education. The site is designed to be interactive and to present the data in a clear, easy-to-use manner, with options to download information into Excel or to explore the data within the site's grant program dashboards. The site currently includes data from EDFacts, Consolidated State Performance Reports (CSPR), and the Department's Budget Service office. For more information about these topics, please visit the following web pages:

https://www2.ed.gov/about/inits/ed/edfacts/index.html [see below for the text of the linked page]
https://www2.ed.gov/about/offices/list/om/fs_po/ofo/budget-service.html [this URL was dead at the time of download]

Using the Site

ED Data Express includes two sections that allow users to access and view the data: (1) grant program data dashboards and (2) download functionality. The grant program data dashboards provide a snapshot of information on the funding, participation, and performance of some of the grant programs administered by the U.S. Department of Education's Office of Elementary and Secondary Education. The dashboards are interactive and update depending on the program, state, and school year selected. Additional information is provided through data notes as well as through the small "i" icon. The download functionality allows users to build customized tables of data and contains more data than what is available via the dashboards.
The download functionality also allows users to download data notes, which provide important caveats and contextual information to consider when using the data.

Data Included and Frequency of Updates

The site currently includes funding, participation, and performance data from school years 2010-11 to 2016-17 on formula grant programs administered in the Office of Elementary and Secondary Education. Additional data and data notes will be added to the site over time.

Quality Control and Personally Identifiable Information

All CSPR and EDFacts data are self-reported by each state. The U.S. Department of Education conducts a review of the data and provides feedback to states, but it is ultimately states' responsibility to verify and certify that their data are correct. Please note that during the reporting years represented on this site, the Office of Elementary and Secondary Education in collaboration with EDFacts and SEAs have wor
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Programming Languages Infrastructure as Code (PL-IaC) enables IaC programs written in general-purpose programming languages like Python and TypeScript. The currently available PL-IaC solutions are Pulumi and the Cloud Development Kits (CDKs) of Amazon Web Services (AWS) and Terraform. This dataset provides metadata and initial analyses of all public GitHub repositories in August 2022 with an IaC program, including their programming languages, applied testing techniques, and licenses. Further, we provide a shallow copy of the head state of those 7104 repositories whose licenses permit redistribution. The dataset is available under the Open Data Commons Attribution License (ODC-By) v1.0. Contents:
- metadata.zip: The dataset metadata and analysis results as CSV files.
- scripts-and-logs.zip: Scripts and logs of the dataset creation.
- LICENSE: The Open Data Commons Attribution License (ODC-By) v1.0 text.
- README.md: This document.
- redistributable-repositiories.zip: Shallow copies of the head state of all redistributable repositories with an IaC program.

This artifact is part of the ProTI Infrastructure as Code testing project: https://proti-iac.github.io.

Metadata

The dataset's metadata comprises three tabular CSV files containing metadata about all analyzed repositories, IaC programs, and testing source code files.

repositories.csv:
- ID (integer): GitHub repository ID
- url (string): GitHub repository URL
- downloaded (boolean): Whether cloning the repository succeeded
- name (string): Repository name
- description (string): Repository description
- licenses (string, list of strings): Repository licenses
- redistributable (boolean): Whether the repository's licenses permit redistribution
- created (string, date & time): Time of the repository's creation
- updated (string, date & time): Time of the last update to the repository
- pushed (string, date & time): Time of the last push to the repository
- fork (boolean): Whether the repository is a fork
- forks (integer): Number of forks
- archive (boolean): Whether the repository is archived
- programs (string, list of strings): Project file path of each IaC program in the repository

programs.csv:
- ID (string): Project file path of the IaC program
- repository (integer): GitHub repository ID of the repository containing the IaC program
- directory (string): Path of the directory containing the IaC program's project file
- solution (string, enum): PL-IaC solution of the IaC program ("AWS CDK", "CDKTF", "Pulumi")
- language (string, enum): Programming language of the IaC program (enum values: "csharp", "go", "haskell", "java", "javascript", "python", "typescript", "yaml")
- name (string): IaC program name
- description (string): IaC program description
- runtime (string): Runtime string of the IaC program
- testing (string, list of enum): Testing techniques of the IaC program (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
- tests (string, list of strings): File paths of IaC program's tests

testing-files.csv:
- file (string): Testing file path
- language (string, enum): Programming language of the testing file (enum values: "csharp", "go", "java", "javascript", "python", "typescript")
- techniques (string, list of enum): Testing techniques used in the testing file (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
- keywords (string, list of enum): Keywords found in the testing file (enum values: "/go/auto", "/testing/integration", "@AfterAll", "@BeforeAll", "@Test", "@aws-cdk", "@aws-cdk/assert", "@pulumi.runtime.test", "@pulumi/", "@pulumi/policy", "@pulumi/pulumi/automation", "Amazon.CDK", "Amazon.CDK.Assertions", "Assertions_", "HashiCorp.Cdktf", "IMocks", "Moq", "NUnit", "PolicyPack(", "ProgramTest", "Pulumi", "Pulumi.Automation", "PulumiTest", "ResourceValidationArgs", "ResourceValidationPolicy", "SnapshotTest()", "StackValidationPolicy", "Testing", "Testing_ToBeValidTerraform(", "ToBeValidTerraform(", "Verifier.Verify(", "WithMocks(", "[Fact]", "[TestClass]", "[TestFixture]", "[TestMethod]", "[Test]", "afterAll(", "assertions", "automation", "aws-cdk-lib", "aws-cdk-lib/assert", "aws_cdk", "aws_cdk.assertions", "awscdk", "beforeAll(", "cdktf", "com.pulumi", "def test_", "describe(", "github.com/aws/aws-cdk-go/awscdk", "github.com/hashicorp/terraform-cdk-go/cdktf", "github.com/pulumi/pulumi", "integration", "junit", "pulumi", "pulumi.runtime.setMocks(", "pulumi.runtime.set_mocks(", "pulumi_policy", "pytest", "setMocks(", "set_mocks(", "snapshot", "software.amazon.awscdk.assertions", "stretchr", "test(", "testing", "toBeValidTerraform(", "toMatchInlineSnapshot(", "toMatchSnapshot(", "to_be_valid_terraform(", "unittest", "withMocks(")
- program (string): Project file path of the testing file's IaC program

Dataset Creation

scripts-and-logs.zip contains all scripts and logs of the creation of this dataset. In it, executions/executions.log documents the commands that generated this dataset in detail. On a high level, the dataset was created as follows:
1. A list of all repositories with a PL-IaC program configuration file was created using search-repositories.py (documented below). The execution took two weeks due to the non-deterministic nature of GitHub's REST API, causing excessive retries.
2. A shallow copy of the head of all repositories was downloaded using download-repositories.py (documented below).
3. Using analysis.ipynb, the repositories were analyzed for the programs' metadata, including the used programming languages and licenses.
4. Based on the analysis, all repositories with at least one IaC program and a redistributable license were packaged into redistributable-repositiories.zip, excluding any node_modules and .git directories.

Searching Repositories

The repositories are searched through search-repositories.py and saved in a CSV file. The script takes these arguments in the following order:
1. GitHub access token.
2. Name of the CSV output file.
3. Filename to search for.
4. File extensions to search for, separated by commas.
5. Min file size for the search (for all files: 0).
6. Max file size for the search or * for unlimited (for all files: *).

Pulumi projects have a Pulumi.yaml or Pulumi.yml (case-sensitive file name) file in their root folder, i.e., (3) is Pulumi and (4) is yml,yaml. https://www.pulumi.com/docs/intro/concepts/project/
AWS CDK projects have a cdk.json (case-sensitive file name) file in their root folder, i.e., (3) is cdk and (4) is json. https://docs.aws.amazon.com/cdk/v2/guide/cli.html
CDK for Terraform (CDKTF) projects have a cdktf.json (case-sensitive file name) file in their root folder, i.e., (3) is cdktf and (4) is json. https://www.terraform.io/cdktf/create-and-deploy/project-setup

Limitations

The script uses the GitHub code search API and inherits its limitations:
- Only forks with more stars than the parent repository are included.
- Only the repositories' default branches are considered.
- Only files smaller than 384 KB are searchable.
- Only repositories with fewer than 500,000 files are considered.
- Only repositories that have had activity or have been returned in search results in the last year are considered.

More details: https://docs.github.com/en/search-github/searching-on-github/searching-code

The results of the GitHub code search API are not stable. However, the generally more robust GraphQL API does not support searching for files in repositories: https://stackoverflow.com/questions/45382069/search-for-code-in-github-using-graphql-v4-api

Downloading Repositories

download-repositories.py downloads all repositories in CSV files generated through search-repositories.py and generates an overview CSV file of the downloads. The script takes these arguments in the following order:
1. Name of the repositories CSV files generated through search-repositories.py, separated by commas.
2. Output directory to download the repositories to.
3. Name of the CSV output file.

The script only downloads a shallow recursive copy of the HEAD of the repo, i.e., only the main branch's most recent state, including submodules, without the rest of the git history. Each repository is downloaded to a subfolder named by the repository's ID.
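To work with the metadata, the three CSV files can be joined on the identifier columns documented above (programs.repository references repositories.ID, and testing-files.program references programs.ID). Below is a minimal Python/pandas sketch, assuming metadata.zip has been extracted into the working directory; the aggregate at the end is only an example of the kind of query the tables support.

```python
# Sketch: load and join the three metadata tables of the dataset.
# Assumes metadata.zip has been extracted into the current directory.
import pandas as pd

repositories = pd.read_csv("repositories.csv")
programs = pd.read_csv("programs.csv")
testing_files = pd.read_csv("testing-files.csv")

# Attach repository metadata to each IaC program
# (programs.repository references repositories.ID).
programs_with_repo = programs.merge(
    repositories, left_on="repository", right_on="ID",
    suffixes=("_program", "_repository"),
)

# Attach its IaC program to each testing file
# (testing-files.program references programs.ID).
tests_with_program = testing_files.merge(
    programs, left_on="program", right_on="ID",
    suffixes=("_file", "_program"),
)

# Example aggregate: number of IaC programs per PL-IaC solution and language.
print(programs.groupby(["solution", "language"]).size())
```

The two collection scripts are driven purely by the positional arguments documented above. The following sketch shows example invocations for Pulumi projects; the token handling, file names, and output directory are placeholders, not values taken from the dataset's own execution logs.

```python
# Sketch: example invocations of the two scripts for Pulumi projects,
# following the argument order documented above. All names are placeholders.
import os
import subprocess

token = os.environ["GITHUB_TOKEN"]  # hypothetical: token supplied via the environment

# search-repositories.py: (1) token, (2) CSV output, (3) filename,
# (4) extensions, (5) min file size, (6) max file size
subprocess.run(
    ["python", "search-repositories.py",
     token, "pulumi-repositories.csv", "Pulumi", "yml,yaml", "0", "*"],
    check=True,
)

# download-repositories.py: (1) input CSVs, (2) output directory, (3) overview CSV
subprocess.run(
    ["python", "download-repositories.py",
     "pulumi-repositories.csv", "repositories/", "downloads.csv"],
    check=True,
)

# For a single repository, the shallow recursive copy described above is roughly
# equivalent to (an assumption about the script's internals, not a quote from it):
#   git clone --depth 1 --recurse-submodules <repository-url> <repository-id>/
```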
The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using rigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset contains 65,000+ photos of more than 5,000 people from 40 countries, making it a valuable resource for exploring and developing identity verification solutions. It is aimed at researchers and developers working on biometric verification, especially in areas like facial recognition and financial services.
By utilizing this dataset, researchers can develop more robust re-identification algorithms, a key factor in ensuring privacy and security in various applications.
This dataset offers an opportunity to explore re-identification challenges by providing 13 selfies of each individual against diverse backgrounds with different lighting, paired with 2 ID photos from different document types.
Devices: Samsung M31, Infinix Note 11, Tecno Pop 7, Samsung A05, iPhone 15 Pro Max, and others
Resolution: 1000 x 750 and higher
This dataset enables the development of more robust and reliable authentication systems, ultimately enhancing customer onboarding by streamlining verification processes, minimizing fraud, and improving overall security for a wide range of services, including online platforms, financial institutions, and government agencies.
Our Price Paid Data includes information on all property sales in England and Wales that are sold for value and are lodged with us for registration.
Get up to date with the permitted use of our Price Paid Data:
check what to consider when using or publishing our Price Paid Data
If you use or publish our Price Paid Data, you must add the following attribution statement:
Contains HM Land Registry data © Crown copyright and database right 2021. This data is licensed under the Open Government Licence v3.0.
Price Paid Data is released under the Open Government Licence (OGL): http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/. You need to make sure you understand the terms of the OGL before using the data.
Under the OGL, HM Land Registry permits you to use the Price Paid Data for commercial or non-commercial purposes. However, OGL does not cover the use of third party rights, which we are not authorised to license.
Price Paid Data contains address data processed against Ordnance Survey's AddressBase Premium product, which incorporates Royal Mail's PAF® database (Address Data). Royal Mail and Ordnance Survey permit your use of Address Data in the Price Paid Data:
If you want to use the Address Data in any other way, you must contact Royal Mail. Email address.management@royalmail.com.
The following fields comprise the address data included in Price Paid Data:
The January 2025 release includes:
As we will be adding to the January data in future releases, we would not recommend using it in isolation as an indication of market or HM Land Registry activity. When the full dataset is viewed alongside the data we've previously published, it adds to the overall picture of market activity.
Your use of Price Paid Data is governed by conditions and by downloading the data you are agreeing to those conditions.
Google Chrome (Chrome 88 onwards) is blocking downloads of our Price Paid Data. Please use another internet browser while we resolve this issue. We apologise for any inconvenience caused.
We update the data on the 20th working day of each month. You can download the:
These include standard and additional price paid data transactions received at HM Land Registry from 1 January 1995 to the most current monthly data.
Your use of Price Paid Data is governed by conditions and by downloading the data you are agreeing to those conditions.
The data is updated monthly, and the average size of this file is 3.7 GB. You can download:
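However the complete file is obtained, it runs to several gigabytes of CSV, so it is usually processed in chunks rather than loaded in one go. Below is a minimal Python/pandas sketch; pp-complete.csv is a hypothetical local filename for the downloaded complete file, and since the file ships without a header row, the column names supplied here are illustrative assumptions rather than the official field specification.

```python
# Sketch: read a large Price Paid CSV in chunks and count transactions per year.
# "pp-complete.csv" is a hypothetical local filename; the column names below are
# assumptions used only so the relevant fields can be referenced by name.
import pandas as pd

columns = [
    "transaction_id", "price", "date_of_transfer", "postcode",
    "property_type", "old_new", "duration", "paon", "saon",
    "street", "locality", "town_city", "district", "county",
    "ppd_category_type", "record_status",
]

transactions_per_year = {}
for chunk in pd.read_csv(
    "pp-complete.csv", names=columns, header=None,
    usecols=["price", "date_of_transfer"], parse_dates=["date_of_transfer"],
    chunksize=1_000_000,
):
    counts = chunk.groupby(chunk["date_of_transfer"].dt.year)["price"].count()
    for year, n in counts.items():
        transactions_per_year[year] = transactions_per_year.get(year, 0) + int(n)

print(dict(sorted(transactions_per_year.items())))
```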
LTLSensitivityToLength : LTL artefacts on length-sensitivity
Procedures and artefacts to reproduce experiments on "length" sensitivity of LTL formulas.
We use both Tapaal https://www.tapaal.net/ and ITS-Tools https://lip6.github.io/ITSTools-web/ in these experiments on models taken from the model-checking competition 2021 https://mcc.lip6.fr/.
See also companion GitHub project here : https://github.com/yanntm/LTLSensitivityToLength/
Workflow and steps
Please refer to the source of the "~/demo.sh" file for more details on the different steps.
Step 0 : download the VM provided by the conference from https://zenodo.org/record/5562597. Click the ".ova" file to download the VM, then open it and start it with VirtualBox.
Step 1 : log in to the VM with user/pass : tacas22/tacas22. You can change the keyboard layout as follows: right-click the background, open Display Settings, select "Region & Language" on the left, and under "Input sources" first "Add" French or your own keyboard type (it is under "other"), then remove "US-en" using the garbage-can icon. Right-clicking the background and choosing "Open in terminal" will now give a terminal with the correct keyboard.
Step 2 : download and deploy the artefact in the VM. Use the Zenodo link at the bottom of this page to download it and place it at the root of the tacas22 home directory, /home/tacas22/. Then deploy it with tar xvzf home.tgz.
Step 3 : there are several README files with more details, but the rest is done using demo.sh, so simply run it. It will carry out the remaining steps.
Steps in the demo
Much more detail is available in the ~/demo.sh file.
Step 0 : install Debian dependencies. We use the R language and a modern Java.
Step 1 : analyze the sensitivity to length of LTL formulas and compute reduced models. This step is handled by ITS-tools. Because the benchmark is so large, we only reproduce three model instances (out of 2822). More examples can be run by editing "step 1" of demo.sh to add more runs.
Step 2 : run an MCC model checker on both the reduced and the original model/formula pairs, collect logs, and compute a CSV summary file from these raw logs. We run both Tapaal and ITS-Tools. We did our best to simulate the conditions of the original runs (output is captured in OAR.XX.out and OAR.XX.err files), but the full experiment used a cluster with OAR to reserve CPUs, so the match is not perfect. The actual scripts we used on the cluster are also part of this distribution, however.
Step 3 : build tables from the CSV and the formulas from the literature on sensitivity to length in practice. This corresponds to the Table in section 4.1 of the paper. For a large set of formulas we compute whether each is stutter insensitive, shortening insensitive, lengthening insensitive, or arbitrary (a minimal illustration of the stutter-insensitivity check with Spot's Python bindings follows this list of steps). This step can actually be performed before or after Step 2; it only depends on data produced at Step 1.
Step 4 : build tables and plots from the CSV of Step 2 using R. The demo script builds the plots both from the data you have just collected at Step 1 and from the full data of our experiment. To make these auditable, we provide the CSV resulting from our cluster run in ~/tacas22/Rscripts/clusterLog, as well as all the logs of this run in ~/tacas22/logsCluster.tgz, which were parsed to build the CSV.
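As a small, self-contained illustration of the sensitivity analysis performed in Step 3, the sketch below checks whether LTL formulas are stutter insensitive using Spot's Python bindings (a local installation of Spot ships under ~/usr/ in this artefact). The formulas are arbitrary examples, and the finer shortening/lengthening classification from the paper is not reproduced here; this only demonstrates the stutter-invariance building block.

```python
# Minimal sketch: check stutter insensitivity of LTL formulas with Spot's
# Python bindings. The formulas are arbitrary examples; the paper's finer
# shortening/lengthening classification is not reproduced here.
import spot

formulas = [
    "G(request -> F grant)",  # no X operator: stutter insensitive
    "X request",              # uses X: not stutter insensitive
]

for text in formulas:
    f = spot.formula(text)
    print(f"{text:25} stutter insensitive: {spot.is_stutter_invariant(f)}")
```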
Contents
We have deployed all dependencies.
~/usr/ contains local installations of the LTL manipulation library Spot and of the libraries required for R (used in the analysis)
~/tapaal/ contains Tapaal, configured for MCC mode. It was built from source from Tapaal's bzr repository, following instructions helpfully provided by the authors, Jiri Srba et al.
~/tacas22/Spot-Binary-Builds/ is the folder used to build Spot with the appropriate flags. We used the "build_cluster.sh" script. You do not need to do this, however; it is already deployed in ~/usr/local. The companion GitHub project is https://github.com/yanntm/Spot-Binary-Builds.
~/tacas22/packages/ contains the apt-get Debian packages needed to run our demo. Essentially, we need R and Java language support.
There are instructions on how to rebuild these dependencies from scratch in each subfolder. Again it is not necessary to do so with the provided archive.
Then the tools/models :
~/tacas22/ITS-Tools-MCC contains the ITS-Tools distribution as well as the inputs from the MCC'21 edition. Both of these are extracted and installed following the instructions in the three GitHub projects:
- https://github.com/yanntm/pnmcc-models-2021 (curated/annotated models from MCC 21)
- https://github.com/yanntm/pnmcc-tests (for the test/runner framework)
- https://github.com/yanntm/ITS-Tools-MCC (for ITS-tools distributed for the MCC)
~/tacas22/LTLPatterns/ contains formulas collected from the literature as well as a script to analyze their sensitivity and compute metrics on them. See the README in that folder for more details.
~/tacas22/Rscripts/ contains scripts to analyze the results and produce plots used in the paper. See the README in that folder for more details. It also contains the CSV produced from our cluster run (in clusterLog/) to reproduce the plots (these were built using the full logs provided in ~/tacas22/logsCluster.tgz and analyzed with the perl scripts from the ~/tacas22/ITS-Tools-MCC folder)
The requirements for this setup include:
A version of Spot : https://spot.lrde.epita.fr/ to both translate LTL to an automaton and analyze its sensitivity.
The models and formulas from the Model Checking Contest 2021. We grab these from our PNMCC Models 2021 repository that itself builds the files using the official distribution of the MCC.
Our test/runner framework for these examples, available from https://github.com/yanntm/pnmcc-tests
Then we need an MCC-compatible tool that can compete in the LTL category of the contest. We used:
A version of ITS-tools : we use the version packaged for the MCC competition, available from the ITS-Tools-MCC repository linked above.
A version of Tapaal : we build it from the source repositories with flags to enable MCC mode. See repository here : https://bazaar.launchpad.net/~verifypn-maintainers/verifypn/new-trunk/files/head:/Scripts/MCC21/competition-scripts and https://code.launchpad.net/verifypn
License
This work is provided under the terms of the GPL v3 or any later version.
(C) Yann Thierry-Mieg, Denis Poitrenaud, Etienne Renault, Emmanuel Paviot-Adet. Sorbonne Université, CNRS. 2021.