5 datasets found
  1. Geographic Diversity in Public Code Contributions — Replication Package

    • data.niaid.nih.gov
    Updated Mar 31, 2022
    Cite
    Davide Rossi; Stefano Zacchiroli (2022). Geographic Diversity in Public Code Contributions — Replication Package [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6390354
    Dataset updated
    Mar 31, 2022
    Dataset provided by
    University of Bologna, Italy
    LTCI, Télécom Paris, Institut Polytechnique de Paris
    Authors
    Davide Rossi; Stefano Zacchiroli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Geographic Diversity in Public Code Contributions - Replication Package

    This document describes how to replicate the findings of the paper: Davide Rossi and Stefano Zacchiroli, 2022, Geographic Diversity in Public Code Contributions - An Exploratory Large-Scale Study Over 50 Years. In 19th International Conference on Mining Software Repositories (MSR ’22), May 23-24, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3524842.3528471

    This document comes with the software needed to mine and analyze the data presented in the paper.

    Prerequisites

    These instructions assume the use of the bash shell, the Python programming language, the PostgreSQL DBMS (version 11 or later), the zstd compression utility, and the usual *nix shell utilities (cat, pv, …), all of which are available for multiple architectures and OSs. It is advisable to create a Python virtual environment and install the following PyPI packages:

    click==8.0.4 cycler==0.11.0 fonttools==4.31.2 kiwisolver==1.4.0 matplotlib==3.5.1 numpy==1.22.3 packaging==21.3 pandas==1.4.1 patsy==0.5.2 Pillow==9.0.1 pyparsing==3.0.7 python-dateutil==2.8.2 pytz==2022.1 scipy==1.8.0 six==1.16.0 statsmodels==0.13.2
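
    A minimal way to set this up (the virtual environment path is arbitrary; the pip invocation simply pins the versions listed above):

    sh> python3 -m venv venv
    sh> source venv/bin/activate
    sh> pip install click==8.0.4 cycler==0.11.0 fonttools==4.31.2 kiwisolver==1.4.0 \
          matplotlib==3.5.1 numpy==1.22.3 packaging==21.3 pandas==1.4.1 patsy==0.5.2 \
          Pillow==9.0.1 pyparsing==3.0.7 python-dateutil==2.8.2 pytz==2022.1 \
          scipy==1.8.0 six==1.16.0 statsmodels==0.13.2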

    Initial data

    swh-replica, a PostgreSQL database containing a copy of Software Heritage data. The schema for the database is available at https://forge.softwareheritage.org/source/swh-storage/browse/master/swh/storage/sql/. We retrieved these data from Software Heritage, in collaboration with the archive operators, taking an archive snapshot as of 2021-07-07. We cannot make these data available in full as part of the replication package due to both their volume and the presence of personal information such as user email addresses. However, equivalent data (stripped of email addresses) can be obtained from the Software Heritage archive dataset, as documented in the article: Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli, The Software Heritage Graph Dataset: Public software development under one roof. In Proceedings of MSR 2019: The 16th International Conference on Mining Software Repositories, May 2019, Montreal, Canada. Pages 138-142, IEEE 2019. http://dx.doi.org/10.1109/MSR.2019.00030. Once retrieved, the data can be loaded into PostgreSQL to populate swh-replica; a minimal loading sketch follows the list of data files below.

    names.tab - forenames and surnames per country with their frequency

    zones.acc.tab - countries/territories, timezones, population and world zones

    c_c.tab - matches between ccTLD entities and world zones
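
    A minimal loading sketch, assuming the schema SQL from the swh-storage repository above has been saved locally as swh-schema.sql (a hypothetical file name), and that the archive dataset dumps have been retrieved as documented with the MSR 2019 article:

    sh> createdb swh-replica
    sh> psql -f swh-schema.sql swh-replica    # create the Software Heritage tables
    sh> # then bulk-load the email-stripped dumps into those tables, e.g. with psql \copy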

    Data preparation

    Export data from the swh-replica database to create commits.csv.zst and authors.csv.zst

    sh> ./export.sh

    Run the authors cleanup script to create authors--clean.csv.zst

    sh> ./cleanup.sh authors.csv.zst

    Filter out implausible names and create authors--plausible.csv.zst

    sh> pv authors--clean.csv.zst | unzstd | ./filter_names.py 2> authors--plausible.csv.log | zstdmt > authors--plausible.csv.zst

    Zone detection by email

    Run the email detection script to create author-country-by-email.tab.zst

    sh> pv authors--plausible.csv.zst | zstdcat | ./guess_country_by_email.py -f 3 2> author-country-by-email.csv.log | zstdmt > author-country-by-email.tab.zst

    Database creation and initial data ingestion

    Create the PostgreSQL DB

    sh> createdb zones-commit

    Note that, from now on, commands shown at the psql> prompt are assumed to be executed in a psql session connected to the zones-commit database.
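
    For example, such a session is opened with:

    sh> psql zones-commit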

    Import data into PostgreSQL DB

    sh> ./import_data.sh

    Zone detection by name

    Extract the commit data from the DB to create commits.tab, which is used as input for the zone detection script

    sh> psql -f extract_commits.sql zones-commit

    Run the world zone detection script to create commit_zones.tab.zst

    sh> pv commits.tab | ./assign_world_zone.py -a -n names.tab -p zones.acc.tab -x -w 8 | zstdmt > commit_zones.tab.zst

    Use ./assign_world_zone.py --help if you are interested in changing the script parameters.

    Ingest zones assignment data into the DB

    psql> \copy commit_zone from program 'zstdcat commit_zones.tab.zst | cut -f1,6 | grep -Ev ''\s$'''
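
    To preview what this \copy will ingest, the same pipeline can be run directly in the shell (outside psql the quotes are not doubled):

    sh> zstdcat commit_zones.tab.zst | cut -f1,6 | grep -Ev '\s$' | head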

    Extraction and graphs

    Run the script that executes the queries extracting the data to plot from the DB. This creates commit_zones_7120.tab, author_zones_7120_t5.tab, commit_zones_7120.grid and author_zones_7120_t5.grid. Edit extract_data.sql if you wish to modify extraction parameters (start/end year, sampling, …).

    sh> ./extract_data.sh

    Run the script to create the graphs from all the previously extracted tabfiles.

    sh> ./create_stackedbar_chart.py -w 20 -s 1971 -f commit_zones_7120.grid -f author_zones_7120_t5.grid -o chart.pdf

  2. Worldwide Gender Differences in Public Code Contributions - Replication Package

    • data.niaid.nih.gov
    Updated Feb 9, 2022
    Cite
    Davide Rossi; Stefano Zacchiroli (2022). Worldwide Gender Differences in Public Code Contributions - Replication Package [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6020474
    Dataset updated
    Feb 9, 2022
    Dataset provided by
    LTCI, Télécom Paris, Institut Polytechnique de Paris, France
    University of Bologna, Italy
    Authors
    Davide Rossi; Stefano Zacchiroli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Worldwide Gender Differences in Public Code Contributions - Replication Package

    This document describes how to replicate the findings of the paper: Davide Rossi and Stefano Zacchiroli, 2022, Worldwide Gender Differences in Public Code Contributions. In Software Engineering in Society (ICSE-SEIS'22), May 21-29, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3510458.3513011

    This document comes with the software needed to mine and analyze the data presented in the paper.

    Prerequisites

    These instructions assume the use of the bash shell, the Python programming language, the PostgreSQL DBMS (version 11 or later), the zstd compression utility, and the usual *nix shell utilities (cat, pv, ...), all of which are available for multiple architectures and OSs. It is advisable to create a Python virtual environment and install the following PyPI packages:

    click==8.0.3 cycler==0.10.0 gender-guesser==0.4.0 kiwisolver==1.3.2 matplotlib==3.4.3 numpy==1.21.3 pandas==1.3.4 patsy==0.5.2 Pillow==8.4.0 pyparsing==2.4.7 python-dateutil==2.8.2 pytz==2021.3 scipy==1.7.1 six==1.16.0 statsmodels==0.13.0
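
    The virtual environment recipe sketched for the previous package applies here unchanged, only with these pins:

    sh> python3 -m venv venv && source venv/bin/activate
    sh> pip install click==8.0.3 cycler==0.10.0 gender-guesser==0.4.0 kiwisolver==1.3.2 \
          matplotlib==3.4.3 numpy==1.21.3 pandas==1.3.4 patsy==0.5.2 Pillow==8.4.0 \
          pyparsing==2.4.7 python-dateutil==2.8.2 pytz==2021.3 scipy==1.7.1 \
          six==1.16.0 statsmodels==0.13.0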

    Initial data

    swh-replica, a PostgreSQL database containing a copy of Software Heritage data. The schema for the database is available at https://forge.softwareheritage.org/source/swh-storage/browse/master/swh/storage/sql/. We retrieved these data from Software Heritage, in collaboration with the archive operators, taking an archive snapshot as of 2021-07-07. We cannot make these data available in full as part of the replication package due to both their volume and the presence of personal information such as user email addresses. However, equivalent data (stripped of email addresses) can be obtained from the Software Heritage archive dataset, as documented in the article: Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli, The Software Heritage Graph Dataset: Public software development under one roof. In Proceedings of MSR 2019: The 16th International Conference on Mining Software Repositories, May 2019, Montreal, Canada. Pages 138-142, IEEE 2019. http://dx.doi.org/10.1109/MSR.2019.00030. Once retrieved, the data can be loaded into PostgreSQL to populate swh-replica.

    names.tab - forenames and surnames per country with their frequency

    zones.acc.tab - countries/territories, timezones, population and world zones

    c_c.tab - matches between ccTLD entities and world zones

    Data preparation

    Export data from the swh-replica database to create commits.csv.zst and authors.csv.zst

    sh> ./export.sh

    Run the authors cleanup script to create authors--clean.csv.zst

    sh> ./cleanup.sh authors.csv.zst

    Filter out implausible names and create authors--plausible.csv.zst

    sh> pv authors--clean.csv.zst | unzstd | ./filter_names.py 2> authors--plausible.csv.log | zstdmt > authors--plausible.csv.zst

    Gender detection

    Run the gender guessing script to create author-fullnames-gender.csv.zst

    sh> pv authors--plausible.csv.zst | unzstd | ./guess_gender.py --fullname --field 2 | zstdmt > author-fullnames-gender.csv.zst
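
    A quick spot check of the output (just eyeballing the first few rows) can be done with:

    sh> zstdcat author-fullnames-gender.csv.zst | head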

    Database creation and data ingestion

    Create the PostgreSQL DB

    sh> createdb gender-commit

    Note that, from now on, commands shown at the psql> prompt are assumed to be executed in a psql session connected to the gender-commit database.

    Import data into PostgreSQL DB

    sh> ./import_data.sh

    Zone detection

    Extract the commit data from the DB to create commits.tab, which is used as input for the zone detection script

    sh> psql -f extract_commits.sql gender-commit

    Run the world zone detection script to create commit_zones.tab.zst

    sh> pv commits.tab | ./assign_world_zone.py -a -n names.tab -p zones.acc.tab -x -w 8 | zstdmt > commit_zones.tab.zst

    Use ./assign_world_zone.py --help if you are interested in changing the script parameters.

    Read zones assignment data from the file into the DB

    psql> \copy commit_culture from program 'zstdcat commit_zones.tab.zst | cut -f1,6 | grep -Ev ''\s$'''

    Extraction and graphs

    Run the script that executes the queries extracting the data to plot from the DB. This creates commits_tz.tab, authors_tz.tab, commits_zones.tab, authors_zones.tab, and authors_zones_1620.tab. Edit extract_data.sql if you wish to modify extraction parameters (start/end year, sampling, ...).

    sh> ./extract_data.sh

    Run the script to create the graphs from all the previously extracted tabfiles. This will generate commits_tzs.pdf, authors_tzs.pdf, commits_zones.pdf, authors_zones.pdf, and authors_zones_1620.pdf.

    sh> ./create_charts.sh

    Additional graphs

    This package also includes some ready-made graphs:

    authors_zones_1.pdf: stacked graphs showing the ratio of female authors per world zone through the years, considering all authors with at least one commit per period

    authors_zones_2.pdf: ditto with at least two commits per period

    authors_zones_10.pdf: ditto with at least ten commits per period

  3. Cluster-based gene detection of mitochondrial genomes using de-Bruijn graphs

    • datasetcatalog.nlm.nih.gov
    Updated May 3, 2021
    Cite
    Middendorf, Martin; Fiedler, Lisa; Bernt, Matthias (2021). Cluster-based gene detection of mitochondrial genomes using de-Bruijn graphs [Dataset]. http://doi.org/10.5281/zenodo.4632893
    Dataset updated
    May 3, 2021
    Authors
    Middendorf, Martin; Fiedler, Lisa; Bernt, Matthias
    Description

    Database and result data sets for MDBG Annotation. Contains:
    - PostgreSQL database copy (db)
    - CSV for gapStatistic
    - CSV for jaccardStatistic
    Source code is available at:

  4. MedSynora DW - Medical Data Warehouse

    • kaggle.com
    Updated Mar 14, 2025
    Cite
    BenMebrar (2025). MedSynora DW - Medical Data Warehouse [Dataset]. https://www.kaggle.com/datasets/mebrar21/medsynora-dw
    Available download formats: zip (89253728 bytes)
    Dataset updated
    Mar 14, 2025
    Authors
    BenMebrar
    License

    Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    MedSynora DW – A Comprehensive Synthetic Hospital Patient Data Warehouse

    Overview

    MedSynora DW is a large synthetic dataset that simulates the operational flow of a big hospital from a patient-centered perspective. It covers patient encounters, treatments, lab tests, vital signs, cost details, and more over the full year of 2024, and is designed to support data science, machine learning, and business intelligence projects in the healthcare domain.

    Project Highlights

    • Realistic Simulation: Generated using advanced Python scripts and statistical models, the dataset reflects realistic hospital operations and patient flows without using any real patient data.

    • Comprehensive Schema: The data warehouse includes multiple fact and dimension tables:
      o Fact Tables: Encounter, Treatment, Lab Tests, Special Tests, Vitals, and Cost.
      o Dimension Tables: Patient, Doctor, Disease, Insurance, Room, Date, Chronic Diseases, Allergies, and Additional Services.
      o Bridge Tables: For managing many-to-many relationships (e.g., doctors per encounter), among others.

    • Synthetic & Scalable: The dataset is entirely synthetic, ensuring privacy and compliance. It is designed to be scalable; the current version simulates around 145,000 encounter records.

    Data Generation

    • Data Sources & Methods: Data is generated using a collection of Python libraries. Highly customized algorithms simulate realistic patient demographics, doctor assignments, treatment choices, lab test results, cost breakdowns, and more.

    • Diverse Scenarios: With over 300 diseases, thousands of treatment variations, and dozens of lab and special tests, the dataset offers rich variability to support complex analytical projects.

    How to Use This Dataset

    • For Data Modeling & ETL Testing: Import the CSV files into your favorite database system (e.g., PostgreSQL, MySQL, or directly into a BI tool like Power BI) and set up relationships as described in the accompanying documentation; a minimal import sketch follows this list.

    • For Machine Learning Projects: Use the dataset to build predictive models related to patient outcomes, cost analysis, or treatment efficacy.

    • For Educational Purposes: Ideal for learning about data warehousing, star schema design, and advanced analytics in healthcare.
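
    As a minimal sketch of the PostgreSQL route (the table name, columns, and CSV file name below are hypothetical; adjust them to the actual files in the download):

    sh> createdb medsynora_dw
    sh> psql medsynora_dw -c 'CREATE TABLE dim_patient (patient_id int PRIMARY KEY, full_name text, birth_date date)'
    sh> psql medsynora_dw -c "\copy dim_patient FROM 'DimPatient.csv' WITH (FORMAT csv, HEADER true)"

    Foreign keys between fact and dimension tables can then be added with ALTER TABLE ... ADD FOREIGN KEY, following the relationships described in the documentation.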

    Final Note

    MedSynora DW offers a unique opportunity to experiment with a comprehensive, realistic hospital data warehouse without compromising real patient information. Enjoy exploring, analyzing, and building with this dataset, and feel free to reach out with questions or suggestions. Feedback from domain experts on inconsistencies, deficiencies, or possible improvements will help shape future versions.

  5. Worldwide Gender Differences in Public Code Contributions - Replication Package

    • data-staging.niaid.nih.gov
    Updated Feb 9, 2022
    Cite
    The citation is currently not available for this dataset.
    Dataset updated
    Feb 9, 2022
    Dataset provided by
    LTCI, Télécom Paris, Institut Polytechnique de Paris, France
    University of Bologna, Italy
    Authors
    Davide Rossi; Stefano Zacchiroli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Worldwide Gender Differences in Public Code Contributions - Replication Package

    The description and replication instructions of this record are identical to those of dataset 2 above.

