62 datasets found
  1. Street Sweeping Schedule

    • catalog.data.gov
    • data.sfgov.org
    Updated Oct 4, 2025
    Cite
    data.sfgov.org (2025). Street Sweeping Schedule [Dataset]. https://catalog.data.gov/dataset/street-sweeping-schedule
    Explore at:
    Dataset updated
    Oct 4, 2025
    Dataset provided by
    data.sfgov.org
    Description

    A. SUMMARY Mechanical street sweeping and street cleaning schedule managed by San Francisco Public Works.

    B. HOW THE DATASET IS CREATED This dataset is created by extracting all street sweeping schedule data from a Department of Public Works database. It is then geocoded to add common identifiers such as the Centerline Network Number ("CNN") and published to the open data portal.

    C. UPDATE PROCESS This dataset is updated on an 'as needed' basis, when sweeping schedules change.

    D. HOW TO USE THIS DATASET Use this dataset to understand, track, or analyze street sweeping in San Francisco.

  2. Annual Count of Summer Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated Feb 7, 2023
    Cite
    Met Office (2023). Annual Count of Summer Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/TheMetOffice::annual-count-of-summer-days-projections-12km/explore?showTable=true
    Explore at:
    Dataset updated
    Feb 7, 2023
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Area covered
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming level, but the average difference between the ‘lower’ values before and after this update is 0.6.]

    What does the data show?
    The Annual Count of Summer Days is the number of days per year where the maximum daily temperature (the hottest point in the day) is above 25°C. It measures how many times the threshold is exceeded (not by how much) in a year. Note, the term ‘summer days’ refers to the threshold; temperatures above 25°C outside the summer months also contribute to the annual count. The results should be interpreted as an approximation of the projected number of days when the threshold is exceeded, as there are many factors, such as natural variability and local-scale processes, that the climate model is unable to represent.

    The Annual Count of Summer Days is calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of summer days to previous values.

    What are the possible societal impacts?
    An increase in the Annual Count of Summer Days indicates increased health risks from high temperatures. Impacts include:

    • Increased heat-related illnesses, hospital admissions or deaths for vulnerable people.
    • Transport disruption due to overheating of railway infrastructure.
    • Periods of increased water demand.

    Other metrics such as the Annual Count of Hot Summer Days (days above 30°C), the Annual Count of Extreme Summer Days (days above 35°C) and the Annual Count of Tropical Nights (where the minimum temperature does not fall below 20°C) also indicate impacts from high temperatures; however, they use different temperature thresholds.

    What is a global warming level?
    The Annual Count of Summer Days is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5), where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming. The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C.

    The data at each warming level was calculated using a 21-year period. These 21-year periods are found by taking 10 years either side of the first year at which the global warming level is reached; this year will differ between model ensemble members. To calculate the value for the Annual Count of Summer Days, an average is taken across the 21-year period. Therefore, the Annual Count of Summer Days shows the number of summer days that could occur each year at each given level of warming.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real-world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate: it will depend on future greenhouse gas emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could be either higher or lower than this level.

    What are the naming conventions and how do I explore the data?
    This data contains a field for each global warming level and the two baselines. Fields are named ‘Summer Days’, then the warming level or baseline, then ‘upper’, ‘median’ or ‘lower’. E.g. ‘Summer Days 2.5 median’ is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names, e.g. ‘Summer Days 2.5 median’ is ‘SummerDays_25_median’. To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578. Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘Summer Days 2.0°C median’ values.

    What do the ‘median’, ‘upper’, and ‘lower’ values mean?
    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run, each ensemble member with slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future. For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Count of Summer Days was calculated for each ensemble member, and the members were then ranked from lowest to highest for each location. The ‘lower’ fields are the second-lowest ranked ensemble member, the ‘upper’ fields are the second-highest ranked ensemble member, and the ‘median’ field is the central value of the ensemble.

    This gives a median value and a spread of ensemble members indicating the range of possible outcomes in the projections. This spread can be used to infer the uncertainty in the projections: the larger the difference between the lower and upper fields, the greater the uncertainty. ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods, as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and the recent past.

    Useful links
    This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the ‘State of the UK Climate’ report. Further information on the UK Climate Projections (UKCP). Further information on understanding climate data is available within the Met Office Climate Data Portal.
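As a minimal illustration of the metric itself (not the Met Office processing chain), the annual count can be sketched in Python with a hypothetical series of daily maximum temperatures:

```python
# Illustrative sketch: counting "summer days" from daily maximum
# temperatures, using the 25 degC threshold described above.
daily_tmax = [14.2, 26.1, 25.0, 27.8, 19.5, 30.3]  # hypothetical sample (degC)

# A day counts only when the maximum exceeds the threshold; the count
# measures how many times it is exceeded, not by how much, so a day at
# exactly 25.0 degC does not contribute.
summer_days = sum(1 for t in daily_tmax if t > 25.0)
print(summer_days)  # 3
```

In the real dataset this count is computed per grid cell per year and then averaged over the 21-year window for each warming level.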

  3. SF Employee Compensation

    • kaggle.com
    zip
    Updated Jan 1, 2021
    Cite
    City of San Francisco (2021). SF Employee Compensation [Dataset]. https://www.kaggle.com/san-francisco/sf-employee-compensation
    Explore at:
    zip (26766515 bytes)
    Dataset updated
    Jan 1, 2021
    Dataset authored and provided by
    City of San Francisco
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Area covered
    San Francisco
    Description

    Content

    A. SUMMARY The San Francisco Controller's Office maintains a database of the salary and benefits paid to City employees since fiscal year 2013.

    B. HOW THE DATASET IS CREATED This data is summarized and presented on the Employee Compensation report hosted at http://openbook.sfgov.org, and is also available in this dataset in CSV format.

    C. UPDATE PROCESS New data is added on a bi-annual basis when available for each fiscal and calendar year.

    D. HOW TO USE THIS DATASET Before using, please first review the following two resources:

    • Data Dictionary - found in the 'About this dataset' section after clicking 'Show More'
    • Employee Compensation FAQ - https://support.datasf.org/help/employee-compensation-faq

    Context

    This is a dataset hosted by the City of San Francisco. The organization has an open data platform, found here, and updates its information according to the amount of data that is brought in. Explore San Francisco's data using Kaggle and all of the data sources available through the San Francisco organization page!

    • Update Frequency: This dataset is updated annually.

    Acknowledgements

    This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.

    Cover photo by rawpixel on Unsplash
    Unsplash Images are distributed under a unique Unsplash License.

  4. Leash-Bio-processed-dataset

    • kaggle.com
    Updated May 26, 2024
    Cite
    hengck23 (2024). Leash-Bio-processed-dataset [Dataset]. https://www.kaggle.com/datasets/hengck23/leash-bio-processed-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 26, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    hengck23
    Description

    Processed dataset for https://www.kaggle.com/competitions/leash-BELKA.

    For any bz2 file, a parallel bzip2 decompressor (https://github.com/mxmlnkn/indexed_bzip2) is recommended for speed.

    Last update : 22-may-2024

    In summary:

    See forum discussion for details of [1],[2]: https://www.kaggle.com/competitions/leash-BELKA/discussion/492846

    [1] reduced data

    • train.reduced.parquet : 98_415_610 training SMILES and their information
    • train.bind.npz : 98_415_610 x 3 target matrix
    • test.reduced.parquet : 878_022 test SMILES
    • all_buildingblock.csv: building blocks id used in train.reduced.parquet/test.reduced.parquet
    • fold0.parquet: train_share,valid_share,valid_nonshare splits for the experiments in the discussion

    [2] extracted ECFP4 fingerprints

    • train.ecfp4.packed.npz : Features extracted using rdkit
      • AllChem.GetMorganFingerprintAsBitVect(mol, 2, 2048)
      • repack with np.packbits() to give 98_415_610 x 256 feature matrix
    • test.ecfp4.packed.npz : similarly processed for the test SMILES

    This is somewhat obsolete as the competition has progressed: ECFP6 gives better results and can be extracted quickly with scikit-fingerprints.
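The repacking step in [2] can be sketched with NumPy alone; here a hypothetical random bit vector stands in for a real RDKit fingerprint. Packing 8 bits per byte turns each 2048-bit vector into 256 bytes, which is where the 98_415_610 x 256 matrix shape comes from:

```python
import numpy as np

# Hypothetical stand-in for one ECFP4 fingerprint: 2048 bits as uint8 0/1
# (a real fingerprint would come from RDKit, as described above).
rng = np.random.default_rng(0)
bits = (rng.random(2048) < 0.05).astype(np.uint8)

# np.packbits() packs 8 bits per byte: 2048 bits -> 256 bytes per molecule.
packed = np.packbits(bits)
print(packed.shape)  # (256,)

# np.unpackbits() recovers the original 0/1 vector losslessly.
restored = np.unpackbits(packed)
assert (restored == bits).all()
```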

    See forum discussion for details of [3]: https://www.kaggle.com/competitions/leash-BELKA/discussion/498858 https://www.kaggle.com/code/hengck23/lb6-02-graph-nn-example

    [3] graph NN processed data

    • test/train-replace-c.smiles.bytestring.bz2 : SMILES with the linker [Dy] replaced by C. Note that these are bytestrings, not strings.
    • train-replace-c-30m.graph.pickle.**.b2z : 98_415_610 molecule graphs split into 3 files. Test graphs are not provided as they can be generated on the fly.

    See forum discussion for details of [4]: https://www.kaggle.com/competitions/leash-BELKA/discussion/505985 https://www.kaggle.com/code/hengck23/conforge-open-source-conformer-generator

    [4] conformers, i.e. estimated molecule xyz data

    • test-replace-c.conforge.sdf.bz2 : conformers in an SDF file. You can read the file using rdkit's Chem.SDMolSupplier().
    • test-replace-c.conforge.status.parquet:
      • The 'status' column shows the status of conformer generation; 0 means success. For failure cases, the SDF stores a dummy 'CC' molecule.
      • The 'idx' column is the index (primary key) into test.reduced.parquet; use this to retrieve SMILES strings. Note that the conformers are based on test-replace-c.smiles.bytestring.bz2, i.e. [Dy] is replaced by C.
    • train-replace-c.sub-[split].conforge.sdf.bz2/status.parquet : similar format as described above. [split] values are:
      • train: 1000250+(1001610*3) molecules
      • valid: 40000
      • nonshare: about 61674
  5. August 2021 data-update for "Updated science-wide author databases of standardized citation indicators"

    • elsevier.digitalcommonsdata.com
    Updated Oct 19, 2021
    Cite
    Jeroen Baas (2021). August 2021 data-update for "Updated science-wide author databases of standardized citation indicators" [Dataset]. http://doi.org/10.17632/btchxktzyw.3
    Explore at:
    Dataset updated
    Oct 19, 2021
    Authors
    Jeroen Baas
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0) (https://creativecommons.org/licenses/by-nc/3.0/)
    License information was derived automatically

    Description

    Citation metrics are widely used and misused. We have created a publicly available database of over 100,000 top scientists that provides standardized information on citations, h-index, co-authorship-adjusted hm-index, citations to papers in different authorship positions, and a composite indicator. Separate data are shown for career-long and single-year impact. Metrics with and without self-citations and the ratio of citations to citing papers are given. Scientists are classified into 22 scientific fields and 176 sub-fields. Field- and subfield-specific percentiles are also provided for all scientists who have published at least 5 papers. Career-long data are updated to end-of-2020. The selection is based on the top 100,000 by c-score (with and without self-citations) or a percentile rank of 2% or above.

    The dataset and code provide an update to the previously released version 1 data at https://doi.org/10.17632/btchxktzyw.1. The version 2 dataset is based on the May 06, 2020 snapshot from Scopus and is updated to citation year 2019; it is available at https://doi.org/10.17632/btchxktzyw.2.

    This version (3) is based on the Aug 01, 2021 snapshot from Scopus and is updated to citation year 2020.

  6. Annual Growing Degree Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated May 22, 2023
    Cite
    Met Office (2023). Annual Growing Degree Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/TheMetOffice::annual-growing-degree-days-projections-12km/explore?showTable=true
    Explore at:
    Dataset updated
    May 22, 2023
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Area covered
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming level, but the average percentage change between the ‘lower’ values before and after this update is -1%.]

    What does the data show?
    A Growing Degree Day (GDD) is a day on which the average temperature is above 5.5°C; it is the number of degrees above this threshold that counts towards the Growing Degree Days. For example, if the average temperature for a specific day is 6°C, this contributes 0.5 Growing Degree Days to the annual sum, whereas an average temperature of 10.5°C contributes 5 Growing Degree Days. Given the data shows the annual sum of Growing Degree Days, this value can be above 365 in some parts of the UK.

    Annual Growing Degree Days are calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of GDD to previous values.

    What are the possible societal impacts?
    Annual Growing Degree Days indicate whether conditions are suitable for plant growth. An increase in GDD can indicate larger crop yields due to increased crop growth from warm temperatures, but crop growth also depends on other factors. For example, GDD do not include any measure of rainfall/drought, sunlight, day length or wind, species vulnerability, or plant dieback in extremely high temperatures. GDD can indicate increased crop growth until temperatures reach a critical level above which there are detrimental impacts on plant physiology. GDD do not estimate the growth of specific species and are not a measure of season length.

    What is a global warming level?
    Annual Growing Degree Days are calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5), where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850–1900 and 2011–2020), whilst this dataset allows for the exploration of greater levels of warming. The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C.

    The data at each warming level was calculated using a 21-year period. These 21-year periods are found by taking 10 years either side of the first year at which the global warming level is reached; this year will differ between model ensemble members. To calculate the value for the Annual Growing Degree Days, an average is taken across the 21-year period. Therefore, the Annual Growing Degree Days show the number of growing degree days that could occur each year at each given level of warming.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real-world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate: it will depend on future greenhouse gas emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could be either higher or lower than this level.

    What are the naming conventions and how do I explore the data?
    This data contains a field for each global warming level and the two baselines. Fields are named ‘GDD’ (Growing Degree Days), then the warming level or baseline, then ‘upper’, ‘median’ or ‘lower’. E.g. ‘GDD 2.5 median’ is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not field names, e.g. ‘GDD 2.5 median’ is ‘GDD_25_median’. To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578. Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘GDD 2.0°C median’ values.

    What do the ‘median’, ‘upper’, and ‘lower’ values mean?
    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run, each ensemble member with slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future. For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, Annual Growing Degree Days were calculated for each ensemble member, and the members were then ranked from lowest to highest for each location. The ‘lower’ fields are the second-lowest ranked ensemble member, the ‘upper’ fields are the second-highest ranked ensemble member, and the ‘median’ field is the central value of the ensemble.

    This gives a median value and a spread of ensemble members indicating the range of possible outcomes in the projections. This spread can be used to infer the uncertainty in the projections: the larger the difference between the lower and upper fields, the greater the uncertainty. ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods, as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and the recent past.

    Useful links
    This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the ‘State of the UK Climate’ report. Further information on the UK Climate Projections (UKCP). Further information on understanding climate data is available within the Met Office Climate Data Portal.
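The per-day contribution described above (degrees of the daily mean above 5.5°C, summed over the year) can be sketched as a few lines of Python, using hypothetical sample temperatures rather than real UKCP18 output:

```python
# Illustrative sketch: annual Growing Degree Days as the sum of
# daily-mean degrees above the 5.5 degC threshold described above.
THRESHOLD_C = 5.5

def annual_gdd(daily_mean_temps):
    """Sum each day's excess over 5.5 degC; days at or below it add 0."""
    return sum(max(0.0, t - THRESHOLD_C) for t in daily_mean_temps)

# Hypothetical daily means: 6 degC adds 0.5, 10.5 degC adds 5, 4 degC adds 0,
# matching the worked examples in the description.
print(annual_gdd([6.0, 10.5, 4.0]))  # 5.5
```

Because each warm day can contribute several degree-days, the annual sum can exceed 365, as the description notes.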

  7. Campaign Finance - Transactions

    • data.sfgov.org
    • catalog.data.gov
    csv, xlsx, xml
    Updated Dec 2, 2025
    Cite
    (2025). Campaign Finance - Transactions [Dataset]. https://data.sfgov.org/City-Management-and-Ethics/Campaign-Finance-Transactions/pitq-e56w
    Explore at:
    csv, xml, xlsx
    Dataset updated
    Dec 2, 2025
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
    License information was derived automatically

    Description

    A. SUMMARY Transactions from FPPC Forms 460, 461, 496, 497, and 450. This dataset combines all schedules, pages, and includes unitemized totals. Only transactions from the "most recent" version of a filing (original/amendment) appear here.

    B. HOW THE DATASET IS CREATED Committees file campaign statements with the Ethics Commission on a periodic basis. Those statements are stored with the Commission's data provider. Data is generally presented as-filed by committees.

    If a committee files an amendment, the data from that filing completely replaces the original and any prior amendments in the filing sequence.

    C. UPDATE PROCESS Each night starting at midnight Pacific time, a script checks for new filings in the Commission's database and updates this dataset with transactions from those filings. The update process can take a variable amount of time to complete. Viewing or downloading this dataset while the update is running may result in incomplete data; it is therefore highly recommended to view or download this data before midnight or after 8am.

    During the update, some fields are copied from the Filings dataset into this dataset for viewing convenience. The copy process may occasionally fail for some transactions due to timing issues but should self-correct the following day. Such transactions have a blank 'Filing Id Number' or 'Filing Date' field; they can still be joined with the appropriate record using the 'Filing Activity Nid' field shared between the Filing and Transaction datasets.

    D. HOW TO USE THIS DATASET
    Transactions from rejected filings are not included in this dataset. Transactions from many different FPPC forms and schedules are combined in this dataset; refer to the column "Form Type" to differentiate transaction types. Properties suffixed with "-nid" can be used to join the data between the Filers, Filings, and Transactions datasets. Refer to the Ethics Commission's webpage for more information. FPPC Form 460 is organized into schedules as follows:

    • A: Monetary Contributions Received
    • B1: Loans Received
    • B2: Loan Guarantors
    • C: Nonmonetary Contributions Received
    • D: Summary of Expenditures Supporting/Opposing Other Candidates, Measures and Committees
    • E: Payments Made
    • F: Accrued Expenses (Unpaid Bills)
    • G: Payments Made by an Agent or Independent Contractor (on Behalf of This Committee)
    • H: Loans Made to Others
    • I: Miscellaneous Increases to Cash
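The '-nid' join described above can be sketched in plain Python; the records below are hypothetical miniatures standing in for the real Filings and Transactions datasets (the actual column set is much larger):

```python
# Hypothetical miniature of the Filings and Transactions datasets,
# joined on the shared 'Filing Activity Nid' field as described above.
filings = [
    {"Filing Activity Nid": "n1", "Filing Id Number": "F-100", "Filing Date": "2025-01-31"},
    {"Filing Activity Nid": "n2", "Filing Id Number": "F-101", "Filing Date": "2025-02-28"},
]
transactions = [
    {"Filing Activity Nid": "n1", "Form Type": "A", "Amount": 250.0},
    # e.g. a transaction whose copied filing fields came through blank:
    {"Filing Activity Nid": "n2", "Form Type": "E", "Amount": 75.0},
]

# Index filings by nid, then attach the filing fields to each transaction.
by_nid = {f["Filing Activity Nid"]: f for f in filings}
joined = [{**t, **by_nid[t["Filing Activity Nid"]]} for t in transactions]
print(joined[1]["Filing Id Number"])  # F-101
```

The same pattern recovers 'Filing Id Number' and 'Filing Date' for transactions where the nightly copy step left those fields blank.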

    RELATED DATASETS

  8. Annual Budget 2013 Table C FCC - Dataset - data.smartdublin.ie

    • data.smartdublin.ie
    Updated Jun 28, 2021
    Cite
    (2021). Annual Budget 2013 Table C FCC - Dataset - data.smartdublin.ie [Dataset]. https://data.smartdublin.ie/dataset/annual-budget-2013-table-c-fcc2
    Explore at:
    Dataset updated
    Jun 28, 2021
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains data from the Council’s Annual Budget. The budget is comprised of Tables A to F and Appendix 1. Each table is represented by a separate data file.

    Table C is the Calculation of the Annual Rate on Valuation for the Financial Year for Balbriggan Town Council. It contains:

    • Estimate of ‘Money Demanded’
    • Adopted ‘Money Demanded’
    • Estimated ‘Irrecoverable rates and cost of collection’
    • Adopted ‘Irrecoverable rates and cost of collection’
    • Total Sum to be Raised, which is the sum of ‘Money Demanded’ and ‘Irrecoverable rates and cost of collection’
    • ‘Annual Rate on Valuation to meet Total Sum to be Raised’

    This dataset is used to create Table C in the published Annual Budget document, which can be found at www.fingal.ie. The data is best understood by comparing it to Table C. Data fields for Table C are as follows:

    • Doc : Table Reference
    • Heading : Indicates sections in the Table - Table C is comprised of one section, therefore the Heading value for all records = 1
    • Ref : Town Reference
    • Desc : Town Description
    • MD_Est : Money Demanded Estimated
    • MD_Adopt : Money Demanded Adopted
    • IR_Est : Irrecoverable rates and cost of collection Estimated
    • IR_Adopt : Irrecoverable rates and cost of collection Adopted
    • NEV : Annual Rate on Valuation to meet Total Sum to be Raised
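Given those fields, the derived total can be sketched as follows; the figures are hypothetical, while the field names and the sum follow the data dictionary above:

```python
# Hypothetical Table C record; field names follow the data dictionary
# above (MD_Adopt = Money Demanded Adopted, IR_Adopt = Irrecoverable
# rates and cost of collection Adopted).
record = {"MD_Adopt": 1_200_000.0, "IR_Adopt": 50_000.0}

# Total Sum to be Raised is the sum of 'Money Demanded' and
# 'Irrecoverable rates and cost of collection'.
total_sum_to_be_raised = record["MD_Adopt"] + record["IR_Adopt"]
print(total_sum_to_be_raised)  # 1250000.0
```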

  9. C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Sep 22, 2010
    Cite
    nasa.gov (2010). C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/c-mapss-aircraft-engine-simulator-data
    Explore at:
    Dataset updated
    Sep 22, 2010
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    SPECIAL NOTE: C-MAPSS and C-MAPSS40K ARE CURRENTLY UNAVAILABLE FOR DOWNLOAD. Glenn Research Center management is reviewing the availability requirements for these software packages. We are working with Center management to get the review completed and issues resolved in a timely manner. We will post updates on this website when the issues are resolved. We apologize for any inconvenience. Please contact Jonathan Litt, jonathan.s.litt@nasa.gov, if you have any questions in the meantime.

    Subject Area: Engine Health

    Description: This data set was generated with the C-MAPSS simulator. C-MAPSS stands for 'Commercial Modular Aero-Propulsion System Simulation' and it is a tool for simulating realistic large commercial turbofan engine data. Each flight is a combination of a series of flight conditions with a reasonable linear transition period to allow the engine to change from one flight condition to the next. The flight conditions are arranged to cover a typical ascent from sea level to 35K ft and descent back down to sea level. The fault was injected at a given time in one of the flights and persists throughout the remaining flights, effectively increasing the age of the engine. The intent is to identify which flight, and when in the flight, the fault occurred.

    How Data Was Acquired: The data provided is from a high-fidelity, system-level engine simulation designed to simulate nominal and faulted engine degradation over a series of flights. The simulated data was created with a Matlab Simulink tool called C-MAPSS.

    Sample Rates and Parameter Description: The flights are full flight recordings sampled at 1 Hz and consist of 30 engine and flight condition parameters. Each flight contains 7 unique flight conditions for an approximately 90 min flight, including ascent to cruise at 35K ft and descent back to sea level. The parameters for each flight are the flight conditions, health indicators, temperature measurements and pressure measurements.

    Faults/Anomalies: Faults arose from the inlet engine fan, the low pressure compressor, the high pressure compressor, the high pressure turbine and the low pressure turbine.
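From the sampling description, the rough size of one flight recording can be sketched as back-of-envelope arithmetic (assuming exactly 90 minutes; actual flight lengths are only approximate):

```python
# Rough expected size of one C-MAPSS flight recording, based on the
# description above: 1 Hz sampling, ~90 minutes, 30 recorded parameters.
sample_rate_hz = 1
flight_minutes = 90
n_parameters = 30

n_samples = sample_rate_hz * flight_minutes * 60  # one sample per second
print((n_samples, n_parameters))  # (5400, 30)
```

So each flight is on the order of 5,400 rows by 30 columns, before any variation in actual flight duration.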

  10. The global industrial value-added dataset under different global change scenarios (2010, 2030, and 2050)

    • scidb.cn
    Updated Aug 6, 2024
    Cite
    Song Wei; li huan huan; Duan Jianping; Li Han; Xue Qian; Zhang Xuyang (2024). The global industrial value-added dataset under different global change scenarios (2010, 2030, and 2050) [Dataset]. http://doi.org/10.57760/sciencedb.11406
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Song Wei; li huan huan; Duan Jianping; Li Han; Xue Qian; Zhang Xuyang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description
    1. Temporal Coverage of Data: The data collection periods are 2010, 2030, and 2050.

    2. Spatial Coverage and Projection: Spatial Coverage: Global; Longitude: -180° – 180°; Latitude: -90° – 90°; Projection: GCS_WGS_1984.

    3. Disciplinary Scope: The data pertains to the fields of Earth Sciences and Geography.

    4. Data Volume: The total data volume is approximately 31.5 MB.

    5. Data Type: Raster (GeoTIFF)

    6. Thumbnail (illustrating dataset content or observation process/scene)

    7. Field (Feature) Name Explanation: a. Name: IND: Industrial Value Added. b. Unit of Measurement: US Dollars (USD).

    8. Data Source Description:
    a. Remote Sensing Data: 2010 Global Vegetation Index data (Enhanced Vegetation Index, EVI, from MODIS monthly average data) and 2010 Nighttime Light Remote Sensing data (DMSP/OLS)
    b. Meteorological Data: From the CMCC-CM model in the fifth phase of the Coupled Model Intercomparison Project (CMIP5) published by the United Nations Intergovernmental Panel on Climate Change (IPCC)
    c. Statistical Data: From the World Development Indicators dataset of the World Bank and various national statistical agencies
    d. Gross Domestic Product Data: Sourced from the project "Study on the Harmful Processes of Population and Economic Systems under Global Change" under the National Key R&D Program "Mechanisms and Assessment of Risks in Population and Economic Systems under Global Change," led by Researcher Sun Fubao at the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
    e. Other Data: Rivers, roads, settlements, and DEM, sourced from the National Oceanic and Atmospheric Administration (NOAA), the Global Risk Data Platform, and Natural Earth

    9. Data Processing Methods:
    (1) Spatialization of Baseline Industrial Value Added: Using 2010 global EVI vegetation index data and nighttime light remote sensing data, we addressed the oversaturation issue in nighttime light data by constructing an adjusted nighttime light index to obtain the optimal global light data. The EANTLI model was developed using NTL, NTLn, and EVI data, with the following formula: Here, EANTLI represents the adjusted nighttime light index, NTL represents the original nighttime light intensity value, and NTLn represents the normalized nighttime light intensity value. Based on the optimal light index EANTLI and the industrial value-added data from the World Bank, we constructed a regression allocation model to derive industrial value added (I), generating the global 2010 industrial value-added data with the formula: Here, I represents the industrial value added for each grid cell, and Ii represents the industrial value added for each country, with EANTLIi derived from ArcGIS statistical analysis and the regression allocation model.
    (2) Spatial Boundaries for Future Industrial Value Added: Using the Logistic-CA-Markov simulation principle and global land use data from 2010 and 2015 (from the European Space Agency), we simulated national land use changes for 2030 and 2050 and extracted urban land data as the spatial boundaries for future industrial value added. To comprehensively characterize the influence of different factors on land use, and considering the research scale, we selected elevation, slope, population, GDP, distance to rivers, and distance to roads as land use driving factors. Accuracy validation using global 2015 land use data showed an average accuracy of 91.89%.
    (3) Estimation of Future Industrial Value Added: Based on machine learning and using the random forest model, we constructed spatialization models for industrial value added under different climate change scenarios. Here, tem represents temperature, prep represents precipitation, GDP represents national economic output, L represents urban land, D represents slope, and P represents population. The random forest model was constructed using factors such as 2010 industrial value added, urban land distribution, elevation, slope, distances to rivers, roads, railways (considering transportation), and settlements (considering noise and environmental pollution from industrial buildings), along with temperature and precipitation as climate scenario data. Except for varying temperature and precipitation values across scenarios, other variables remained constant. The model comprised 100 decision trees, with each iteration randomly selecting 90% of the samples for model construction and using the remaining 10% as test data, achieving a training sample accuracy of 0.94 and a test sample accuracy of 0.81. By analyzing the proportion of industrial value added to GDP (average from 2000 to 2020, data from the World Bank) and projected GDP under future Shared Socioeconomic Pathways (SSPs), we derived future industrial value added for each country under different SSP scenarios. Using these projections, we constructed regression models to allocate future industrial value added proportionally, resulting in spatial distribution data for 2030 and 2050 under different SSP scenarios.

    10. Applications and Achievements of the Dataset:
    a. Primary Application Areas: This dataset is mainly applied in environmental protection, ecological construction, pollution prevention and control, and the prevention and forecasting of natural disasters.
    b. Achievements in Application (Awards, Published Reports and Articles): Developed a method for downscaling national-scale industrial value-added data by integrating DMSP/OLS nighttime light data, vegetation distribution, and other data. Published the global industrial value-added dataset.
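The adjusted nighttime-light step can be sketched in Python. The source omits the formula itself; the EANTLI formulation below follows a common version from the nighttime-light literature and is an assumption, not necessarily the exact model used for this dataset:

```python
import numpy as np

def eantli(ntl, evi):
    """One common formulation of the Enhanced Adjusted Nighttime Light Index:
    EANTLI = (1 + (NTLn - EVI) / (NTLn + EVI)) * NTL.
    The exact formula used by this dataset was omitted from the source."""
    ntl = np.asarray(ntl, dtype=float)
    evi = np.asarray(evi, dtype=float)
    ntl_n = (ntl - ntl.min()) / (ntl.max() - ntl.min())  # normalized NTL
    return (1.0 + (ntl_n - evi) / (ntl_n + evi)) * ntl

# Saturated pixels (DN = 63 in DMSP/OLS) with low vegetation are boosted
# more than vegetated ones, which separates dense urban cores:
ntl = np.array([0.0, 30.0, 63.0, 63.0])
evi = np.array([0.1, 0.3, 0.1, 0.5])
print(eantli(ntl, evi))
```

The regression allocation step then distributes each country's World Bank industrial value added across its grid cells in proportion to EANTLI.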
  11. Data from: A large EEG database with users' profile information for motor...

    • data.europa.eu
    • zenodo.org
    unknown
    Updated Jan 8, 2023
    Cite
    Zenodo (2023). A large EEG database with users' profile information for motor imagery Brain-Computer Interface research [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7554429?locale=en
    Explore at:
    unknown. Available download formats
    Dataset updated
    Jan 8, 2023
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context: We share a large database containing electroencephalographic signals from 87 human participants, with more than 20,800 trials in total, representing about 70 hours of recording. It was collected during brain-computer interface (BCI) experiments and organized into 3 datasets (A, B, and C) that were all recorded following the same protocol: right and left hand motor imagery (MI) tasks during a single one-day session. It includes the performance of the associated BCI users, detailed information about the users' demographic, personality and cognitive profile, and the experimental instructions and codes (executed in the open-source platform OpenViBE). Such a database could prove useful for various studies, including but not limited to: 1) studying the relationships between BCI users' profiles and their BCI performances, 2) studying how EEG signal properties vary across users' profiles and MI tasks, 3) using the large number of participants to design cross-user BCI machine learning algorithms, or 4) incorporating users' profile information into the design of EEG signal classification algorithms. Sixty participants (Dataset A) performed the first experiment, designed to investigate the impact of experimenters' and users' gender on MI-BCI user training outcomes, i.e., users' performance and experience (Pillette et al.). Twenty-one participants (Dataset B) performed the second one, designed to examine the relationship between users' online performance (i.e., classification accuracy) and the characteristics of the chosen user-specific Most Discriminant Frequency Band (MDFB) (Benaroch et al.). The only difference between the two experiments lies in the algorithm used to select the MDFB. Dataset C contains 6 additional participants who completed one of the two experiments described above.
    Physiological signals were measured using a g.USBAmp (g.tec, Austria), sampled at 512 Hz, and processed online using OpenViBE 2.1.0 (Dataset A) and OpenViBE 2.2.0 (Dataset B). For Dataset C, participants C83 and C85 were recorded with OpenViBE 2.1.0 and the remaining 4 participants with OpenViBE 2.2.0. Experiments were recorded at Inria Bordeaux Sud-Ouest, France.

    Duration: Each participant's folder contains approximately 48 minutes of EEG recording: six 7-minute runs and a 6-minute baseline.

    Documents: Instructions: checklist read by experimenters during the experiments. Questionnaires: the Mental Rotation test used, and the translation of 4 questionnaires, notably the Demographic and Social information, the Pre- and Post-session questionnaires, and the Index of Learning Styles (English and French versions). Performance: the online OpenViBE BCI classification performances obtained by each participant are provided for each run, as well as answers to all questionnaires. Scenarios/scripts: set of OpenViBE scenarios used to perform each of the steps of the MI-BCI protocol, e.g., acquire training data, calibrate the classifier, or run the online MI-BCI.

    Database: raw signals. Dataset A: N=60 participants; Dataset B: N=21 participants; Dataset C: N=6 participants.
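The per-participant figures quoted above (six 7-minute runs plus a 6-minute baseline at 512 Hz) can be sanity-checked with a few lines of arithmetic:

```python
# Sanity-check the recording figures stated in the description.
FS_HZ = 512          # g.USBAmp sampling rate
RUNS, RUN_MIN = 6, 7 # six 7-minute motor imagery runs
BASELINE_MIN = 6     # one 6-minute baseline

total_min = RUNS * RUN_MIN + BASELINE_MIN
samples_per_run = FS_HZ * RUN_MIN * 60

print(total_min)        # 48 minutes per participant, matching the description
print(samples_per_run)  # 215040 samples per 7-minute run
```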

  12. Data from: Change Counter Dataset

    • universe.roboflow.com
    zip
    Updated Aug 12, 2022
    Cite
    Golden Gorillas Batch C (2022). Change Counter Dataset [Dataset]. https://universe.roboflow.com/golden-gorillas-batch-c/change-counter/dataset/2
    Explore at:
    zip. Available download formats
    Dataset updated
    Aug 12, 2022
    Dataset authored and provided by
    Golden Gorillas Batch C
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Coins Bounding Boxes
    Description

    Change Counter

    ## Overview
    
    Change Counter is a dataset for object detection tasks - it contains Coins annotations for 2,440 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  13. Data from: Global hydrological dataset of daily streamflow data from the...

    • catalogue.ceh.ac.uk
    • hosted-metadata.bgs.ac.uk
    • +3more
    zip
    Updated May 28, 2024
    Cite
    S. Turner; J. Hannaford; L.J. Barker; G. Suman; R. Armitage; A. Killeen; A. Griffin; H. Davies; A. Kumar; H. Dixon; M.T.D. Albuquerque; N. Almeida Ribeiro; C. Alvarez-Garreton; E. Amoussou; B. Arheimer; Y. Asano; T. Berezowski; A. Bodian; H. Boutaghane; R. Capell; H. Dakhaoui; J. Daňhelka; H.X. Do; C. Ekkawatpanit; E.M. El Khalki; A.K. Fleig; R. Fonseca; J.D. Giraldo-Osorio; A.B.T. Goula; M. Hanel; G Hodgkins; S. Horton; C. Kan; D.G. Kingston; G. Laaha; R. Laugesen; W. Lopes; S. Mager; Y. Markonis; L. Mediero; G. Midgley; C. Murphy; P. O'Connor; A.I. Pedersen; H.T. Pham; M. Piniewski; M. Rachdane; B. Renard; M.E. Saidi; P. Schmocker-Facker; K. Stahl; M. Thyler; M. Toucher; Y. Tramblay; J. Uusikivi; N. Venegas-Cordero; S. Vissesri; A. Watson; S. Westra; P.H. Whitfield (2024). Global hydrological dataset of daily streamflow data from the Reference Observatory of Basins for INternational hydrological climate change detection (ROBIN), 1863 - 2022 [Dataset]. http://doi.org/10.5285/3b077711-f183-42f1-bac6-c892922c81f4
    Explore at:
    zip. Available download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    NERC EDS Environmental Information Data Centre
    Authors
    S. Turner; J. Hannaford; L.J. Barker; G. Suman; R. Armitage; A. Killeen; A. Griffin; H. Davies; A. Kumar; H. Dixon; M.T.D. Albuquerque; N. Almeida Ribeiro; C. Alvarez-Garreton; E. Amoussou; B. Arheimer; Y. Asano; T. Berezowski; A. Bodian; H. Boutaghane; R. Capell; H. Dakhaoui; J. Daňhelka; H.X. Do; C. Ekkawatpanit; E.M. El Khalki; A.K. Fleig; R. Fonseca; J.D. Giraldo-Osorio; A.B.T. Goula; M. Hanel; G Hodgkins; S. Horton; C. Kan; D.G. Kingston; G. Laaha; R. Laugesen; W. Lopes; S. Mager; Y. Markonis; L. Mediero; G. Midgley; C. Murphy; P. O'Connor; A.I. Pedersen; H.T. Pham; M. Piniewski; M. Rachdane; B. Renard; M.E. Saidi; P. Schmocker-Facker; K. Stahl; M. Thyler; M. Toucher; Y. Tramblay; J. Uusikivi; N. Venegas-Cordero; S. Vissesri; A. Watson; S. Westra; P.H. Whitfield
    License

    https://eidc.ac.uk/licences/ogl/plain

    Time period covered
    Jan 1, 1863 - Dec 31, 2022
    Area covered
    Earth
    Dataset funded by
    Natural Environment Research Council (https://www.ukri.org/councils/nerc)
    Description

    The Reference Observatory of Basins for INternational hydrological climate change detection (ROBIN) dataset is a global hydrological dataset containing publicly available daily flow data for 2,386 gauging stations across the globe which have natural or near-natural catchments. Metadata is also provided alongside these stations for the Full ROBIN Dataset, consisting of 3,060 gauging stations. Data were quality controlled by the central ROBIN team before being added to the dataset, and two levels of data quality are applied to guide users towards appropriate data usage. Most records have at least 40 years of data with minimal gaps, with some records starting in the late 19th century and running through to 2022. ROBIN represents a significant advance in global-scale, accessible streamflow data. The project was funded by the UK Natural Environment Research Council Global Partnership Seedcorn Fund (NE/W004038/1) and the NC-International programme (NE/X006247/1) delivering National Capability.
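A record-screening check along the lines described (at least ~40 years of daily data with minimal gaps) might look like the sketch below. The column names (`date`, `flow`) and the 5% missing-data threshold are assumptions for illustration, not the ROBIN team's actual quality-control criteria:

```python
import pandas as pd

def meets_robin_guideline(df, min_years=40, max_missing_frac=0.05):
    """Screen one gauging-station record: long enough span, few gaps.
    Column names and thresholds are illustrative assumptions."""
    df = df.sort_values("date")
    span_years = (df["date"].iloc[-1] - df["date"].iloc[0]).days / 365.25
    # Fraction of calendar days in the span with no reported flow value
    full_index = pd.date_range(df["date"].iloc[0], df["date"].iloc[-1], freq="D")
    missing_frac = 1 - df["flow"].notna().sum() / len(full_index)
    return span_years >= min_years and missing_frac <= max_missing_frac

dates = pd.date_range("1970-01-01", "2022-12-31", freq="D")
record = pd.DataFrame({"date": dates, "flow": 1.0})
print(meets_robin_guideline(record))  # True: a 53-year complete record
```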

  14. Coastal Change Analysis Program (C-CAP) Regional Land Cover Data and Change...

    • catalog.data.gov
    • datasets.ai
    • +3more
    Updated Apr 15, 2025
    Cite
    NOAA Office for Coastal Management (Point of Contact, Custodian) (2025). Coastal Change Analysis Program (C-CAP) Regional Land Cover Data and Change Data [Dataset]. https://catalog.data.gov/dataset/coastal-change-analysis-program-c-cap-regional-land-cover-data-and-change-data2
    Explore at:
    Dataset updated
    Apr 15, 2025
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    The NOAA Coastal Change Analysis Program (C-CAP) produces national standardized land cover and change products for the coastal regions of the U.S. C-CAP products inventory coastal intertidal areas, wetlands, and adjacent uplands with the goal of monitoring changes in these habitats, on a one-to-five year repeat cycle. The timeframe for this metadata is reported as 1985 - 2010-Era, but the actual dates of the Landsat imagery used to create the land cover may have been acquired a few years before or after each era. These maps are developed utilizing Landsat Thematic Mapper imagery and can be used to track changes in the landscape through time. This trend information gives important feedback to managers on the success or failure of management policies and programs and aids in developing a scientific understanding of the Earth system and its response to natural and human-induced changes. This understanding allows for the prediction of impacts due to these changes and the assessment of their cumulative effects, helping coastal resource managers make more informed regional decisions. NOAA C-CAP is a contributing member of the Multi-Resolution Land Characteristics consortium, and C-CAP products are included as the coastal expression of land cover within the National Land Cover Database.

  15. Vocational qualifications dataset

    • gov.uk
    • tnaqa.mirrorweb.com
    • +1more
    Updated Nov 20, 2025
    Cite
    Ofqual (2025). Vocational qualifications dataset [Dataset]. https://www.gov.uk/government/statistical-data-sets/vocational-qualifications-dataset
    Explore at:
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Ofqual
    Description

    This dataset covers vocational qualifications from 2012 to the present for England.

    The dataset is updated every quarter. Data for previous quarters may be revised to insert late data or to correct an error. Updates also reflect where qualifications were re-categorised to a different type, level, sector subject area or awarding organisation. Where a quarterly update includes revisions to data for previous quarters, a table of revisions is published in the vocational and other qualifications quarterly release.

    In the dataset, the number of certificates issued is rounded to the nearest 5, and values less than 5 appear as ‘Fewer than 5’ to preserve confidentiality (a 0 represents no certificates).
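The rounding and suppression rules above can be expressed as a small helper (a sketch of the stated rules, not Ofqual's actual processing):

```python
def publishable_count(n):
    """Apply the disclosure rules described above: zero stays 0, counts
    under 5 are suppressed as 'Fewer than 5', everything else is rounded
    to the nearest 5."""
    if n == 0:
        return "0"
    if n < 5:
        return "Fewer than 5"
    return str(5 * round(n / 5))

for n in (0, 3, 7, 12, 13):
    print(n, "->", publishable_count(n))  # 12 -> 10, 13 -> 15, etc.
```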

    Where a qualification has been owned by more than one awarding organisation at different points in time, a separate row is given for each organisation.

    Background information and key headlines for every quarter are published in the vocational and other qualifications quarterly release.

    For any queries contact us at data.analytics@ofqual.gov.uk.

  16. Dataset inventory

    • data.sfgov.org
    • gimi9.com
    • +1more
    csv, xlsx, xml
    Updated Dec 1, 2025
    Cite
    DataSF (2025). Dataset inventory [Dataset]. https://data.sfgov.org/w/y8fp-fbf5/ikek-yizv?cur=Es9l-mSV-5F
    Explore at:
    csv, xlsx, xml. Available download formats
    Dataset updated
    Dec 1, 2025
    Dataset authored and provided by
    DataSF
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    A. SUMMARY The dataset inventory provides a list of data maintained by departments that are candidates for open data publishing or have already been published; it is collected in accordance with Chapter 22D of the Administrative Code. The inventory will be used in conjunction with department publishing plans to track progress toward meeting plan goals for each department.

    B. HOW THE DATASET IS CREATED This dataset is collated in two ways: 1. Ongoing updates are made throughout the year to reflect new datasets; this process involves DataSF staff reconciling publishing records after datasets are published. 2. Annual bulk updates: departments review their inventories, identify changes and updates, and submit those to DataSF for a once-a-year bulk update. Not all departments will have changes, or their changes will already have been captured as ongoing updates over the course of the prior year.

    C. UPDATE PROCESS The dataset is synced automatically daily, but the underlying data changes manually throughout the year as needed.

    D. HOW TO USE THIS DATASET Interpreting dates in this dataset: this dataset has 2 dates: 1. Date Added - when the dataset was added to the inventory itself. 2. First Published - the open data portal automatically captures the date the dataset was first created; this is that system-generated date.

    Note that in certain cases we may have published a dataset prior to it being added to the inventory. We do our best to have an accurate accounting of when something was added to this inventory and when it was published. In most cases the inventory addition will happen prior to publishing, but in certain cases it will be published and we will have missed updating the inventory as this is a manual process.

    First published will give an accounting of when it was actually available on the open data catalog and date added when it was added to this list.

    E. RELATED DATASETS

  17. Inventory of citywide enterprise systems of record
  18. Dataset Inventory: Column-Level Details

  • Active Streets (CNNs, from DataSF, pulled nightly)

    • hub.arcgis.com
    Updated Sep 30, 2025
    Cite
    City and County of San Francisco (2025). Active Streets (CNNs, from DataSF, pulled nightly) [Dataset]. https://hub.arcgis.com/datasets/sfgov::active-streets-cnns-from-datasf-pulled-nightly?uiVersion=content-views
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    City and County of San Francisco
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Area covered
    Description

    A. SUMMARY A list of street centerlines, including both active and retired streets. These centerlines are identified by their Centerline Network Number ("CNN"). B. HOW THE DATASET IS CREATED This data is extracted from the Department of Public Works Basemap. Supervisor District and Analysis Neighborhood are added during the loading process. These boundaries utilize the centroid (middle) of the line to determine the district or neighborhood. C. UPDATE PROCESS This dataset refreshes daily, though the data may not change every day. D. HOW TO USE THIS DATASET Note 1: The Class Code field is used for symbolization: 1 = Freeway; 2 = Major street/Highway; 3 = Arterial street; 4 = Collector Street; 5 = Residential Street; 6 = Freeway Ramp; 0 = Other (private streets, paper streets, etc.). E. RELATED DATASETS Understanding street-level data. Data pushed to ArcGIS Online on November 10, 2025 at 3:25 AM by SFGIS. Data from: https://data.sfgov.org/d/3psu-pn9h. Description of dataset columns:

     cnn
     Centerline Network Number - unique identifier for dataset
    
    
     lf_fadd
     From address number on left side of street, the lowest number in the address range
    
    
     lf_toadd
     To address number on left side of street, the highest number in the address range
    
    
     rt_fadd
     From address number on right side of street, the lowest number in the address range
    
    
     rt_toadd
     To address number on right side of street, the highest number in the address range
    
    
     street
     Street name without street type
    
    
     st_type
     Street Type (AVE, ST, BLVD, etc.)
    
    
     f_st
     The name of the street that the segment intersects at its beginning.
    
    
     t_st
     The name of the street that the segment intersects at its end.
    
    
     f_node_cnn
     Centerline Network Number for the node/intersection that the street segment begins from.
    
    
     t_node_cnn
     Centerline Network Number for the node/intersection that the street segment ends on.
    
    
     accepted
     Accepted by City and County of San Francisco for maintenance.
    
    
     active
     Active street segment, i.e., not retired.
    
    
     classcode
     Classification code for street segment. Used for symbolization: 1 = Freeway 2 = Major street/Highway 3 = Arterial street 4 = Collector Street 5 = Residential Street 6 = Freeway Ramp 0 = Other (private streets, paper street, etc.)
    
    
     date_added
     Date added to dataset by Public Works.
    
    
     date_altered
     Date the segment was altered in the dataset by Public Works.
    
    
     date_dropped
     Date the segment was dropped from the dataset by Public Works.
    
    
     gds_chg_id_add
     The internal change transaction id when the segment was added.
    
    
     gds_chg_id_altered
     The internal change transaction id when the segment was altered.
    
    
     gds_chg_id_dropped
     The internal change transaction id when the segment was dropped/retired.
    
    
     jurisdiction
     Agency with jurisdiction over the segment, if any.
    
    
     layer
     Derived from the source AutoCAD drawing, this field indicates the category of segment. Definitions for each of the values: Freeways, such as 80, 280 and 101. Paper, the centerline segment is present on an Assessor and/or Public Works map, but is not an actual street in reality. Paper_fwys, the centerline segment is present on an Assessor and/or Public Works map, but is not an actual street in reality, and is under or near a freeway. Paper_water, the centerline segment is present on an Assessor and/or Public Works map, but is not an actual street in reality, and is under water in the Bay. PARKS, street segment maintained by the Recreation and Park Department, e.g., in Golden Gate Park. Parks_NPS_FtMaso, street segment maintained by the National Park Service within Fort Mason. Parks_NPS_Presid, street segment maintained by the National Park Service within the Presidio. Private, street segment is not maintained by the City and is not on an Assessor or Public Works map. Private_parking, street segment is not maintained by the City and is not on an Assessor or Public Works map, and is a parking lot. PSEUDO, street segment created for use in addressing. Streets, standard street centerline segment. Streets_HuntersP, standard street centerline segment within the Hunters Point Shipyard area. Streets_Pedestri, standard street centerline segment, but pedestrian access only. Streets_TI, standard street centerline segment within Treasure Island. Streets_YBI, standard street centerline segment within Yerba Buena Island. UPROW, Unpaved Right of Way street centerline segment.
    
    
     nhood
     SFRealtor-defined neighborhood that the segment primarily intersects
    
    
     oneway
     Indicates if the street segment is a one-way street; possible values are F (the segment is one way beginning at the "from" street), T (the segment is one way beginning at the "to" street), or B (traffic is legal in "both" directions)
    
    
     street_gc
     Street name without street type, with the numbered streets with leading zeroes dropped to facilitate geocoding
    
    
     streetname
     Full street name and street type
    
    
     streetname_gc
     Full street name and street type, with the numbered streets with leading zeroes dropped to facilitate geocoding
    
    
     zip_code
     ZIP Code that street segment falls in.
    
    
     analysis_neighborhood
     current analysis neighborhood
    
    
     supervisor_district
     current supervisor district
    
    
     line
     Geometry
    
    
     data_as_of
     Timestamp the data was updated in the source system
    
    
     data_loaded_at
     Timestamp the data was loaded to the open data portal
    

    Note: If no description was provided by DataSF, the cell is left blank. See the source data for more information.
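The classcode and oneway encodings described in the column table above can be decoded with a small lookup. The `describe_segment` helper is hypothetical, for illustration; the value labels come from the dataset documentation itself:

```python
# Decode the symbolization fields described in the column table.
CLASS_CODE = {
    1: "Freeway", 2: "Major street/Highway", 3: "Arterial street",
    4: "Collector street", 5: "Residential street", 6: "Freeway ramp",
    0: "Other (private streets, paper streets, etc.)",
}
ONEWAY = {
    "F": "one way, beginning at the 'from' street",
    "T": "one way, beginning at the 'to' street",
    "B": "both directions",
}

def describe_segment(classcode, oneway):
    """Render one street segment's classcode and oneway fields as text."""
    return f"{CLASS_CODE[classcode]}; {ONEWAY[oneway]}"

print(describe_segment(5, "B"))  # Residential street; both directions
```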

  • SWOT Level 2 Lake Single-Pass Vector Data Product, Version C - Dataset -...

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). SWOT Level 2 Lake Single-Pass Vector Data Product, Version C - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/swot-level-2-lake-single-pass-vector-data-product-version-c
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The SWOT Level 2 Lake Single-Pass Vector Data Product from the Surface Water Ocean Topography (SWOT) mission provides water surface elevation, area, and storage change derived from the high-rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which was completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025. Water surface elevation, area, and storage change are provided in three feature datasets covering the full swath for each continent-pass: 1) an observation-oriented feature dataset of lakes identified in the prior lake database (PLD), 2) a PLD-oriented feature dataset of lakes identified in the PLD, and 3) a feature dataset containing unassigned features (i.e., identified in neither the PLD nor the prior river database (PRD)). These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. Please note that this collection contains SWOT Version C science data products. This dataset is the parent collection to the following sub-collections: https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_LakeSP_obs_2.0 https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_LakeSP_prior_2.0 https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_LakeSP_unassigned_2.0

  • Student Performance Dataset

    • kaggle.com
    Updated Aug 27, 2025
    Cite
    Ghulam Muhammad Nabeel (2025). Student Performance Dataset [Dataset]. https://www.kaggle.com/datasets/nabeelqureshitiii/student-performance-dataset
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 27, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ghulam Muhammad Nabeel
    Description

    📊 Student Performance Dataset (Synthetic, Realistic)

    Overview

    This dataset contains 1,000,000 rows of realistic student performance data, designed for beginners in Machine Learning to practice Linear Regression, model training, and evaluation techniques.

    Each row represents one student, with features like study hours, attendance, class participation, and final score. The dataset is clean and structured to be beginner-friendly.

    🔑 Columns Description

    • student_id → Unique identifier for each student.
    • weekly_self_study_hours → Average weekly self-study hours (0–40). Generated using a normal distribution centered around 15 hours.
    • attendance_percentage → Attendance percentage (50–100). Simulated with a normal distribution around 85%.
    • class_participation → Score between 0–10 indicating how actively the student participates in class. Generated from a normal distribution centered around 6.
    • total_score → Final performance score (0–100). Calculated as a function of study hours + random noise, then clipped between 0–100. Stronger correlation with study hours.
    • grade → Categorical label (A, B, C, D, F) derived from total_score.

    📐 Data Generation Logic

    1. Weekly Study Hours: Modeled using a normal distribution (mean ≈ 15, std ≈ 7), capped between 0 and 40 hours.
    2. Scores: More study hours → higher score. Formula:

    Random noise simulates differences in learning ability, motivation, etc.

    3. Attendance & Participation: Independent but realistic variations added.
    4. Grades: Assigned from scores using thresholds:
    • A: ≥ 85
    • B: ≥ 70
    • C: ≥ 55
    • D: ≥ 40
    • F: < 40
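The generation logic above can be reproduced in outline. The exact score formula and the standard deviations for attendance and participation were not given in the source, so those values below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 1_000  # small sample here; the published dataset has 1,000,000 rows

# Study hours ~ N(15, 7), clipped to [0, 40], as described.
hours = np.clip(rng.normal(15, 7, N), 0, 40)
# Score = f(study hours) + noise, clipped to [0, 100]; the coefficients
# here are a plausible stand-in, since the source omits the formula.
score = np.clip(2.0 * hours + 30 + rng.normal(0, 10, N), 0, 100)

# Attendance ~ N(85, ...) in [50, 100]; participation ~ N(6, ...) in [0, 10].
# The standard deviations (8 and 2) are assumptions.
attendance = np.clip(rng.normal(85, 8, N), 50, 100)
participation = np.clip(rng.normal(6, 2, N), 0, 10)

# Grades from the stated thresholds (A >= 85, B >= 70, C >= 55, D >= 40, else F).
grade = np.select([score >= 85, score >= 70, score >= 55, score >= 40],
                  ["A", "B", "C", "D"], default="F")
print(grade[:5], score[:5].round(1))
```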

    🎯 How to Use This Dataset

    Regression Tasks

    • Predict total_score from weekly_self_study_hours.
    • Train and evaluate Linear Regression models.
    • Extend to multiple regression using attendance_percentage and class_participation.

    Classification Tasks

    • Predict grade (A–F) using study hours, attendance, and participation.

    Model Evaluation Practice

    • Apply train-test split and cross-validation.
    • Evaluate with MAE, RMSE, R².
    • Compare simple vs. multiple regression.

    ✅ This dataset is intentionally kept simple, so that new ML learners can clearly see the relationship between input features (study, attendance, participation) and output (score/grade).

  • Data from: Temporary Street Closures

    • s.cnmilf.com
    • data.sfgov.org
    • +1more
    Updated Oct 18, 2025
    Cite
    data.sfgov.org (2025). Temporary Street Closures [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/temporary-street-closures
    Explore at:
    Dataset updated
    Oct 18, 2025
    Dataset provided by
    data.sfgov.org
    Description

    A. SUMMARY This dataset stores upcoming and current street closures occurring as a result of the Shared Spaces program, certain special events, and some construction work. This dataset only includes street closures permitted by the San Francisco Municipal Transportation Agency (SFMTA). It doesn’t include street closures managed by other City departments such as Public Works or the Police Department. B. HOW THE DATASET IS CREATED The data is exported from the Street Closure Salesforce database, which is maintained by the SFMTA ISCOTT unit, and converted to the geometry of streets and intersections based on closed street extent descriptions. C. UPDATE PROCESS The database is updated constantly; this report is issued daily. D. HOW TO USE THIS DATASET Please be aware that this dataset only contains temporary street closure events that are permitted (status = permitted). This dataset contains various types of street closures; if you are looking for a particular type, please use the type field to select the appropriate closure type (e.g., type = Shared Space). E. RELATED DATASETS For intersection points affected by temporary street closures (rather than the street segments provided here), please see the Temporary Street Closure Intersections dataset. For street closures in the WZDx standard format, see the Temporary Street Closures in the Work Zone Data Exchange (WZDx) Format dataset.
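Filtering as described in section D can be done with pandas. The sample rows below are hypothetical; only the field names (`status`, `type`) come from the description:

```python
import pandas as pd

# Hypothetical rows mirroring the fields named in the description.
closures = pd.DataFrame({
    "street": ["Valencia St", "Grove St", "Mission St"],
    "status": ["permitted", "pending", "permitted"],
    "type":   ["Shared Space", "Special Event", "Construction"],
})

# The published dataset only contains status = permitted rows; to pull one
# closure type, filter on the type field as the description suggests.
shared_spaces = closures[(closures["status"] == "permitted") &
                         (closures["type"] == "Shared Space")]
print(shared_spaces["street"].tolist())  # ['Valencia St']
```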
