https://saildatabank.com/data/apply-to-work-with-the-data/
The probation data included in this dataset is sourced from National Delius (nDelius). nDelius is used for the management of offenders on Probation, or in the community. A service user (offender) is referred to nDelius by a court and an event is created in nDelius. Broadly, events are either a sentence or pre-sentence. An event can only ever have one sentence outcome (disposal). One court case can receive multiple sentences/disposals so more than one event may run at the same time.
A service user will receive one offender_id per court case; however, duplicates can be created by mistake. The variable estimated_offender_id uses a data deduplication process to resolve these duplicates and group them under the same service user cluster.
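As a rough illustration of how the deduplicated identifier might be used, the sketch below groups records by estimated_offender_id and flags clusters that cover more than one offender_id. Only the two identifier names come from the description above; the event_id field, the record layout and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: only offender_id and estimated_offender_id are
# variable names from the dataset description; the values are made up.
records = [
    {"event_id": "E1", "offender_id": "A100", "estimated_offender_id": "P1"},
    {"event_id": "E2", "offender_id": "A250", "estimated_offender_id": "P1"},
    {"event_id": "E3", "offender_id": "A300", "estimated_offender_id": "P2"},
]

def group_by_service_user(rows):
    """Map each estimated_offender_id to the set of offender_ids it covers."""
    clusters = defaultdict(set)
    for row in rows:
        clusters[row["estimated_offender_id"]].add(row["offender_id"])
    return dict(clusters)

clusters = group_by_service_user(records)
# Clusters holding more than one offender_id were merged by deduplication.
merged = {key: ids for key, ids in clusters.items() if len(ids) > 1}
```

Counting people by estimated_offender_id rather than offender_id avoids counting the same service user twice in this toy example.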
The accuracy of the source data is dependent on the quality assurance processes and local recording practices intrinsic to the source data systems used by HMPPS staff (nDelius).
The Research Accreditation Panel provides oversight of the framework that is used to accredit research projects, researchers and processing environments under the Digital Economy Act 2017 (DEA). Researchers are advised to liaise with SAIL support teams to understand the requirements and timelines involved with submitting a research project to the Research Accreditation Panel. https://uksa.statisticsauthority.gov.uk/digitaleconomyact-research-statistics/research-accreditation-panel/
This study includes a synthetically-generated version of the Ministry of Justice Data First Crown Courts datasets. Synthetic versions of all 43 tables in the MoJ Data First data ecosystem have been created. These versions can be used / joined in the same way as the real datasets. As well as underpinning training, synthetic datasets should enable researchers to explore research questions and to design research proposals prior to submitting these for approval. The code created during this exploration and design process should then enable initial results to be obtained as soon as data access is granted.
The Ministry of Justice Data First Crown Court defendant dataset provides data on defendants’ appearances in criminal cases before the Crown Court in England & Wales from 2013. It has been extracted from the XHIBIT management information system, used by His Majesty’s Courts and Tribunals Service (HMCTS) to manage cases within the Crown Court.
Please note: recent Trial and Sentencing cases are now usually recorded on a new case management system, Common Platform. These cases are not included and therefore coverage of Crown Court cases in this dataset will decrease over time (particularly for cases received from mid-2021) although the majority of cases disposed during 2021 and 2022 are captured. Appropriate coverage and time period will be considered in assessing applications to use this data.
Information on defendants’ characteristics, the main offence charged, key case dates, processes and outcomes is included: for example, age, gender, ethnicity, offence category, hearings, pleas, conviction and sentencing. Cases heard at the Crown Court for Trial, Sentencing or Appeal are included. Each record in the dataset gives information about a single person and case. There is one table which gives a case summary based on the principal offence and one with a record for each offence within the case.
As part of Data First, records have been deidentified and deduplicated, using our probabilistic record linking package, Splink, so that a unique identifier is assigned to all records believed to relate to the same person, allowing for longitudinal analysis and investigation of repeat appearances. This opens up the potential to better understand court users and to build evidence on, for example, patterns associated with prolific offending and what works to reduce reoffending.
The Ministry of Justice Data First linking dataset can be used in combination with this and other Data First datasets to join up administrative records about people from across justice services to increase understanding around users’ interactions, pathways and outcomes. Cases can also be linked directly to cases appearing in the Data First magistrates’ courts defendant dataset.
This study includes a synthetically-generated version of the Ministry of Justice Data First Prisons dataset. Synthetic versions of all 43 tables in the MoJ Data First data ecosystem have been created. These versions can be used / joined in the same way as the real datasets. As well as underpinning training, synthetic datasets should enable researchers to explore research questions and to design research proposals prior to submitting these for approval. The code created during this exploration and design process should then enable initial results to be obtained as soon as data access is granted.
The Ministry of Justice Data First prisoner custodial journey dataset provides data on people held in custody in prisons and Young Offender Institutions (YOIs) in England and Wales and has been extracted from the management information system Prison National Offender Management Information System (P-NOMIS), used by His Majesty's Prisons and Probation Service (HMPPS) within prisons.
Data on offenders serving custodial sentences since 2011 is expected to be complete; sentences begun before then are also included. Young Offenders are included if resident at prisons or YOIs that use P-NOMIS. This excludes the majority of Secure Schools and Secure Training Centres.
Information is included on offender characteristics, their main offence, sentence and release: for example, age, gender, ethnicity, offence category, and key dates, providing information on movements through the system and their release and recall (if applicable). There is a separate table on safety in custody incidents involving assaults and self-harm and a table on external movements (between prisons, and into and out of prison). This includes information on the date, type and reason for the movement and the locations involved (for example specific prisons).
Each record in the dataset gives information about a single person and custodial journey. As part of Data First, records have been deidentified and deduplicated, using our probabilistic record linkage package, Splink, so that a unique identifier is assigned to all records believed to relate to the same person, allowing for longitudinal analysis and investigation of repeat appearances. This aims to improve on links already made within the prison system. This opens up the potential to better understand the prison population and address questions on, for example, patterns associated with short repeated custodial sentences and what works to reduce reoffending.
The Ministry of Justice Data First linking dataset can be used in combination with this and other Data First datasets to join up administrative records about people from across justice services to increase understanding around users' interactions, pathways and outcomes.
https://www.icpsr.umich.edu/web/ICPSR/studies/9571/terms
The United Nations began its World Crime Surveys in 1978. The first survey collected statistics on a small range of offenses and on the criminal justice process for the years 1970-1975. The second survey collected data on a wide range of offenses, offenders, and criminal justice process data for the years 1975-1980. Several factors make these two collections difficult to use in combination. Some 25 percent of those countries responding to the first survey did not respond to the second and, similarly, some 30 percent of those responding to the second survey did not respond to the first. In addition, many questions asked in the second survey were not asked in the first survey. This data collection represents the efforts of the investigators to combine, revise, and recheck the data of the first two surveys. The data are divided into two parts. Part 1 comprises all data on offenses and on some criminal justice personnel. Crime data are entered for 1970 through 1980. In most cases 1975 is entered twice, since both surveys collected data for this year. Part 2 includes data on offenders, prosecutions, convictions, and prisons. Data are entered for 1970 through 1980, for every even year.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Open Data 500, funded by the John S. and James L. Knight Foundation (http://www.knightfoundation.org/) and conducted by the GovLab, is the first comprehensive study of U.S. companies that use open government data to generate new business and develop new products and services.
Provide a basis for assessing the economic value of government open data
Encourage the development of new open data companies
Foster a dialogue between government and business on how government data can be made more useful
The Open Data 500 study is conducted by the GovLab at New York University with funding from the John S. and James L. Knight Foundation. The GovLab works to improve people’s lives by changing how we govern, using technology-enabled solutions and a collaborative, networked approach. As part of its mission, the GovLab studies how institutions can publish the data they collect as open data so that businesses, organizations, and citizens can analyze and use this information.
The Open Data 500 team has compiled our list of companies through (1) outreach campaigns, (2) advice from experts and professional organizations, and (3) additional research.
Outreach Campaign
Mass email to over 3,000 contacts in the GovLab network
Mass email to over 2,000 OpenDataNow.com contacts
Blog posts on TheGovLab.org and OpenDataNow.com
Social media recommendations
Media coverage of the Open Data 500
Attending presentations and conferences
Expert Advice
Recommendations from government and non-governmental organizations
Guidance and feedback from Open Data 500 advisors
Research
Companies identified for the book, Open Data Now
Companies using datasets from Data.gov
Directory of open data companies developed by Deloitte
Online Open Data Userbase created by Socrata
General research from publicly available sources
The Open Data 500 is not a rating or ranking of companies. It covers companies of different sizes and categories, using various kinds of data.
The Open Data 500 is not a competition, but an attempt to give a broad, inclusive view of the field.
The Open Data 500 study also does not provide a random sample for definitive statistical analysis. Since this is the first thorough scan of companies in the field, it is not yet possible to determine the exact landscape of open data companies.
This dataset contains data collected within limestone cedar glades at Stones River National Battlefield (STRI) near Murfreesboro, Tennessee. This dataset represents interpolated estimates of precipitation (in decimal inches) for 120 quadrat locations (points) within 12 selected cedar glades, for a period of time from 6 days (144 hours) prior to the field visit day up until 3 days (72 hours) prior to the field visit day. At each field visit, precipitation data was collected at four rain gauges installed at STRI. Rain gauge measurements were obtained on the following dates (which correspond to 3 days prior to the fields in the dataset): February 3, 2012; March 2, 2012; March 30, 2012; April 20, 2012; May 22, 2012; June 18, 2012; July 16, 2012; August 20, 2012; September 24, 2012; November 22, 2012; December 14, 2012; January 18, 2013; February 15, 2013; March 16, 2013; April 12, 2013; and May 10, 2013. Points were classified into four groups (identified by the field "Group") and were visited on a rotating sampling schedule, such that each group of points was visited roughly once every four months.
ArcGIS version 10 software (Esri, Redlands, CA, USA) was used to interpolate a raster surface between these rain gauges so that estimated precipitation values could be assigned to the 120 points used for hydrologic monitoring. First, a raster file was created based on an Inverse Distance Weighted (IDW) algorithm using rain gauge data as inputs, the default output cell size, and the default power of 2. Then, cell values from the interpolated raster were extracted to the 120 hydrologic monitoring points. Missing values (points not measured on a given day) are indicated by the value -99999.
Detailed descriptions of experimental design, field data collection procedures, laboratory procedures, and data analysis are presented in Cartwright (2014).
References:
Cartwright, J. (2014). Soil ecology of a rock outcrop ecosystem: abiotic stresses, soil respiration, and microbial community profiles in limestone cedar glades. Ph.D. dissertation, Tennessee State University.
Cofer, M., Walck, J., and Hidayati, S. (2008). Species richness and exotic species invasion in Middle Tennessee cedar glades in relation to abiotic and biotic factors. The Journal of the Torrey Botanical Society, 135(4), 540–553.
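The Inverse Distance Weighted step described in this entry can be sketched in a few lines of Python. This is a minimal illustration, not the ArcGIS workflow: it evaluates the IDW formula directly at a point instead of building a raster and extracting cell values, so results may differ slightly from the published values. The gauge coordinates and readings below are hypothetical; only the power of 2 and the four-gauge setup come from the description.

```python
import math

def idw(point, gauges, power=2.0):
    """Inverse Distance Weighted estimate at `point`.

    `gauges` is a list of (x, y, value) tuples; each gauge is weighted by
    1 / distance**power, so nearer gauges dominate the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for gx, gy, value in gauges:
        dist = math.hypot(point[0] - gx, point[1] - gy)
        if dist == 0.0:
            return value  # the point coincides with a gauge
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical gauge positions (arbitrary units) and readings (inches).
gauges = [(0.0, 0.0, 1.0), (10.0, 0.0, 2.0), (0.0, 10.0, 3.0), (10.0, 10.0, 4.0)]
estimate = idw((5.0, 5.0), gauges)  # equidistant from all four gauges
```

Because the example point is equally far from every gauge, all weights are equal and the estimate reduces to the plain mean of the four readings.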
Individuals appearing as defendants in criminal cases dealt with by the magistrates' court in England and Wales (including Youth Courts). Companies appearing as defendants have been excluded.
Prarabdha/indian-legal-data-first dataset hosted on Hugging Face and contributed by the HF Datasets community
Find details of First Data Buyer/importer data in US (United States) with product description, price, shipment date, quantity, imported products list, major us ports name, overseas suppliers/exporters name etc. at sear.co.in.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
The LIDAR Composite First Return DSM (Digital Surface Model) is a raster elevation model covering ~99% of England at 1m spatial resolution. The first return DSM is produced from the first or only laser pulse returned to the sensor and includes heights of objects, such as vehicles, buildings and vegetation, as well as the terrain surface where the first or only return was the ground.
Produced by the Environment Agency in 2022, the first return DSM is derived from data captured as part of our national LIDAR programme between 11 November 2016 and 5 May 2022. This programme divided England into ~300 blocks for survey over continuous winters from 2016 onwards. These surveys are merged together to create the first return LIDAR composite using a feathering technique along the overlaps to remove any small differences in elevation between surveys. Please refer to the metadata index catalogues, which show, for any location, which survey was used in the production of the LIDAR composite.
The first return DSM will not match the coverage or extent of the LIDAR composite last return digital surface model (LZ_DSM), as the last return DSM composite is produced from both the national LIDAR programme and Timeseries surveys.
The data is available to download as GeoTiff rasters in 5km tiles aligned to the OS National Grid. The data is presented in metres, referenced to Ordnance Survey Newlyn and using the OSTN’15 transformation method. All individual LIDAR surveys going into the production of the composite had a vertical accuracy of +/-15cm RMSE.
This dataset consists of tweet identifiers for tweets harvested from November 28, 2016, following the election of Donald Trump, through the end of the first 100 days of his administration. Data collection ended May 1, 2017.
Tweets were harvested using multiple methods described below. The total dataset consists of 218,273,152 tweets. Because of the different methods used to harvest tweets, there may be some duplication.
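Since the harvesting methods may overlap, one simple way to handle the duplication is to drop repeated identifiers while keeping the order in which they were first seen. The sketch below assumes the identifiers have been read into Python lists of strings; the IDs and the two list names are hypothetical.

```python
def unique_ids(tweet_ids):
    """Return tweet IDs with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for tweet_id in tweet_ids:
        if tweet_id not in seen:
            seen.add(tweet_id)
            result.append(tweet_id)
    return result

# Hypothetical IDs standing in for identifiers harvested by two methods.
from_method_a = ["803000000000000001", "803000000000000002"]
from_method_b = ["803000000000000002", "803000000000000003"]
combined = unique_ids(from_method_a + from_method_b)
```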
PI: Brett Sanders, University of California Irvine, Department of Civil Engineering (bsanders@uci.edu)
Data Questions Contact: Jochen Schubert, University of California Irvine, Department of Civil Engineering (j.schubert@uci.edu)
Other Authors: Mach Katharine, University of Miami
Year Published: 2024
Years Data Collection: 2020-2024
Geography: Los Angeles, CA, USA
Two data files are provided to calculate parcel-level flood risk across Los Angeles:
parceldatatable.csv contains social data (e.g., population estimates, population fractions by race and ethnicity, and Neighborhood Disadvantage Index (NDI) values)
primo_flooddepth_table.csv contains flood hazard data generated by PRIMo and PRIMo-Drain
parceldatatable.csv contains the following variables:
id: Dataset ID
apn: Assessor's Parcel Number
...
Eleven lists of blue stellar objects (BSOs) found in the First Byurakan Survey (FBS) low-dispersion spectroscopic plates were published in the journal Astrophysics in the period 1990-1996. The selection was carried out in the region with declinations +33 deg. < delta < +45 deg. and delta > +61 degrees with a surface area of 4000 square degrees. As a result, the present catalog of the FBS blue stellar objects (BSOs) has been compiled. Its preliminary version has been available at CDS since 1999. The author has revised and updated the FBS BSOs catalog with the new data from recently published optical and multi-wavelength catalogs to give access to all available data and make further comparative studies of the properties of these objects possible. The author has made cross-correlations of the FBS BSOs catalog with the MAPS, USNO-B1.0, SDSS, and 2MASS catalogs, as well as with ROSAT, IRAS, NVSS, and FIRST catalogs, added updated SIMBAD and NED data for the objects, and provided accurate DSS1 and DSS2 positions and revised photometry. The author also checked the objects for proper motion and variability. A refined classification for the low-dispersion spectra in the Digitized First Byurakan Survey (DFBS) was carried out. The revised and updated catalog of 1103 FBS blue stellar objects is presented here. (The catalog in fact contains 1101 objects, as 2 pairs of objects turned out to be identical; however, the author has kept all objects in the list in order to allow users to enter and find objects by all accepted FBS names.) The FBS blue stellar objects catalog can be used to study a complete sample of white dwarfs, hot sub-dwarfs, horizontal-branch B (HBB) stars, cataclysmic variables, bright AGN, and to investigate individual interesting objects. This table was created by the HEASARC in October 2008 based on the CDS table III/258 file fbs.dat. This latter catalog supersedes the previous edition (Abrahamian et al. 1999, CDS Cat. II/223). This is a service provided by NASA HEASARC.
Irbid First Endowment Directorate data
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
My First Projectbalance Data is a dataset for object detection tasks - it contains Objects annotations for 1,269 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
What percent of high school graduates enrolled in post secondary institutions in the first year after graduation?
The Quarterly Labour Force Survey (QLFS) is a household-based sample survey conducted by Statistics South Africa (Stats SA). It collects data on the labour market activities of individuals aged 15 years or older who live in South Africa.
National coverage
Individuals
The QLFS sample covers the non-institutional population of South Africa with one exception. The only institutional subpopulation included in the QLFS sample is individuals in workers' hostels. Persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster's house and teachers' accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.
Sample survey data
The QLFS uses a master sampling frame that is used by several household surveys conducted by Statistics South Africa. This wave of the QLFS is based on the 2013 master frame, which was created based on the 2011 census. There are 3324 PSUs in the master frame and roughly 33 000 dwelling units.
The sample for the QLFS is based on a stratified two-stage design with probability proportional to size (PPS) sampling of PSUs in the first stage, and sampling of dwelling units (DUs) with systematic sampling in the second stage.
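The two-stage design can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the first stage uses systematic PPS selection, which is one common way to implement PPS and is not confirmed by the description as the method Stats SA uses, and the PSU sizes, sample sizes and function names are all hypothetical.

```python
import random

def pps_systematic(sizes, n, rng):
    """Stage 1: pick n PSU indices with probability proportional to size.

    Assumes systematic PPS: lay the sizes end to end, then take n equally
    spaced points from a random start. A PSU larger than the step could be
    selected more than once.
    """
    step = sum(sizes) / n
    start = rng.random() * step
    chosen = []
    cumulative = 0.0
    idx = 0
    for i in range(n):
        target = start + i * step
        # Advance to the PSU whose size interval contains the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen

def systematic(units, n, rng):
    """Stage 2: pick n units at a fixed interval from a random start."""
    step = len(units) / n
    start = rng.random() * step
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(n)]

rng = random.Random(42)
psu_sizes = [120, 300, 80, 500]  # hypothetical dwelling counts per PSU
sampled_psus = pps_systematic(psu_sizes, 2, rng)
dwellings = list(range(psu_sizes[sampled_psus[0]]))
sampled_dus = systematic(dwellings, 10, rng)
```

With these made-up sizes the largest PSU covers half of the cumulative total, so it is always selected, which shows how PPS favours larger units.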
For each quarter of the QLFS, a quarter of the sampled dwellings are rotated out of the sample. These dwellings are replaced by new dwellings from the same PSU or the next PSU on the list. For more information see the statistical release.
Computer Assisted Telephone Interview
These datasets provide aggregated community risk scores for exposure to flooding using the First Street Foundation Flood Model (Version 1.3) at the county and ZIP code level. county_flood_score and zcta_flood_score provide the overall community risk score. county_flood_category_score and zcta_flood_category_score provide the risk score for specific categories of infrastructure. Each category (critical infrastructure, social infrastructure, residential properties, roads, and commercial properties) is a component of the overall community risk.
If you are interested in acquiring First Street flood data, you can request to access the data here. More information on First Street's flood risk statistics can be found here and information on First Street's hazards can be found here.
The following fields are in the overall risk datasets:
Attribute: Description
county_id: The county FIPS code
count: The count (#) of infrastructure facilities
flood_score: A score of 1, 2, 3, 4, or 5 is shown. Community risk rankings represent risk as Minimal, Minor (1), Moderate (2), Major (3), Severe (4) and Extreme (5). Minimal risk is a case where no facilities within a category have flood risk. County level risks are ranked based on how their total depths compare to counties across the country.
The following fields are in the category risk datasets:
Attribute: Description
FIPS: County FIPS code
ZIP_CODE: ZIP code
count: The approximate length of roads (miles) within the geography of aggregation (i.e. ZIP Code, County)
flood_score: A score (Community Risk level) of 0, 1, 2, 3, 4, or 5 is shown. Community risk levels represent risk as Minimal (0), Minor (1), Moderate (2), Major (3), Severe (4) and Extreme (5). Minimal risk is a case where no facilities within a category have flood risk. ZIP Code and County level risks are assessed based on how their total depths compare to ZIP Codes and Counties across the country.
risk_direction: A score of 1, -1, or 0 is shown. These note if flood risk is expected to increase (1), decrease (-1), or remain constant (0) over the next 30 years.
infrastructure_category_id: 1 = critical infrastructure, 4 = social infrastructure, 6 = residential properties, 8 = roads, 9 = commercial properties
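The coded fields in the category risk datasets can be turned into readable labels with simple lookup tables. The sketch below takes the code-to-label mappings from the field descriptions above; the describe function and the example row are hypothetical, and a row is assumed to be a plain dictionary keyed by the field names.

```python
# Code-to-label mappings copied from the field descriptions.
RISK_LEVELS = {
    0: "Minimal", 1: "Minor", 2: "Moderate",
    3: "Major", 4: "Severe", 5: "Extreme",
}
RISK_DIRECTION = {1: "increase", -1: "decrease", 0: "remain constant"}
INFRASTRUCTURE_CATEGORIES = {
    1: "critical infrastructure",
    4: "social infrastructure",
    6: "residential properties",
    8: "roads",
    9: "commercial properties",
}

def describe(row):
    """Translate the coded fields of one category-risk row into labels."""
    return {
        "category": INFRASTRUCTURE_CATEGORIES[row["infrastructure_category_id"]],
        "risk_level": RISK_LEVELS[row["flood_score"]],
        "trend": RISK_DIRECTION[row["risk_direction"]],
    }

# Hypothetical row: roads with Major risk that is expected to increase.
example = describe(
    {"infrastructure_category_id": 8, "flood_score": 3, "risk_direction": 1}
)
```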
The UK House Price Index is a National Statistic.
Download the full UK House Price Index data below, or use our tool to create your own bespoke reports: https://landregistry.data.gov.uk/app/ukhpi?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=tool&utm_term=9.30_17_01_24
Datasets are available as CSV files. Find out about republishing and making use of the data.
Google Chrome is blocking downloads of our UK HPI data files (Chrome 88 onwards). Please use another internet browser while we resolve this issue. We apologise for any inconvenience caused.
This file includes a derived back series for the new UK HPI. Under the UK HPI, data is available from 1995 for England and Wales, 2004 for Scotland and 2005 for Northern Ireland. A longer back series has been derived by using the historic path of the Office for National Statistics HPI to construct a series back to 1968.
Download the full UK HPI background file:
If you are interested in a specific attribute, we have separated them into these CSV files:
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-prices-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average_price&utm_term=9.30_17_01_24 Average price (CSV, 9.4MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-prices-Property-Type-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average_price_property_price&utm_term=9.30_17_01_24 Average price by property type (CSV, 28.2MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Sales-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=sales&utm_term=9.30_17_01_24 Sales (CSV, 4.9MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Cash-mortgage-sales-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=cash_mortgage-sales&utm_term=9.30_17_01_24 Cash mortgage sales (CSV, 6.9MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/First-Time-Buyer-Former-Owner-Occupied-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=FTNFOO&utm_term=9.30_17_01_24 First time buyer and former owner occupier (CSV, 6.6MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/New-and-Old-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=new_build&utm_term=9.30_17_01_24 New build and existing resold property (CSV, 17.2MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Indices-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=index&utm_term=9.30_17_01_24 Index (CSV, 6.1MB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Indices-seasonally-adjusted-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=index_season_adjusted&utm_term=9.30_17_01_24 Index seasonally adjusted (CSV, 210KB)
http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-price-seasonally-adjusted-2023-11.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average-price_season_adjusted&utm_term=9.30_17_01_24 Average price seasonally a
The First ISCCP Regional Experiments have been designed to improve data products and cloud/radiation parameterizations used in general circulation models (GCMs). Specifically, the goals of FIRE are (1) to seek the basic understanding of the interaction of physical processes in determining life cycles of cirrus and marine stratocumulus systems and the radiative properties of these clouds during their life cycles and (2) to investigate the interrelationships between ISCCP data, GCM parameterizations, and higher space and time resolution cloud data. To date, four intensive field-observation periods were planned and executed: a cirrus IFO (October 13 - November 2, 1986); a marine stratocumulus IFO off the southwestern coast of California (June 29 - July 20, 1987); a second cirrus IFO in southeastern Kansas (November 13 - December 7, 1991); and a second marine stratocumulus IFO in the eastern North Atlantic Ocean (June 1 - June 28, 1992). Each mission combined coordinated satellite, airborne, and surface observations with modeling studies to investigate the cloud properties and physical processes of the cloud systems.
This data set contains images of cirrus clouds advected over the HSRL during FIRE Cirrus 2 in Coffeyville, Kansas. These images consist of both the lidar backscatter and the depolarization ratio of backscatter radiation.