6 datasets found

Career promotions, research publications, Open Access dataset
ordo.open.ac.uk
zip
Updated Feb 28, 2022
Cite
Matteo Cancellieri; Nancy Pontika; David Pride; Petr Knoth; Hannah Metzler; Antonia Correia; Helene Brinken; Bikash Gyawali (2022). Career promotions, research publications, Open Access dataset [Dataset]. http://doi.org/10.21954/ou.rd.19228785.v1
Available download formats: zip
Unique identifier
https://doi.org/10.21954/ou.rd.19228785.v1
Dataset updated
Feb 28, 2022
Dataset provided by
The Open University
Authors
Matteo Cancellieri; Nancy Pontika; David Pride; Petr Knoth; Hannah Metzler; Antonia Correia; Helene Brinken; Bikash Gyawali
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is a compilation of processed data on citations and references for research papers, including their author, institution and open access information, for a selected sample of academics analysed using Microsoft Academic Graph (MAG) data and CORE. The data was collected from December 2019 to January 2020.

Six countries (Austria, Brazil, Germany, India, Portugal, United Kingdom and United States) were the focus of the six questions which make up this dataset. There is one csv file per country and per question (36 files in total). More details about the creation of this dataset are available in the public ON-MERRIT D3.1 deliverable report.

The dataset combines two data sources: one part was created by analysing promotion policies across the target countries, while the second part is a set of data points for understanding publishing behaviour. To facilitate the analysis, the dataset is organised in the following seven folders:

PRT
The file "PRT_policies.csv" contains the information extracted from promotion, review and tenure (PRT) policies.

Q1: What % of papers coming from a university are Open Access?
- Dataset Name format: oa_status_countryname_papers.csv
- Dataset Contents: Open Access (OA) status of all papers of all the universities listed in Times Higher Education World University Rankings (THEWUR) for the given country. A paper is marked OA if at least one OA link is available. OA links are collected using the CORE Discovery API.
- Important considerations about this dataset:
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - The service we used to recognise if a paper is OA, CORE Discovery, does not contain entries for all paperids in MAG. This implies that some of the records in the dataset will have neither a true nor a false value for the is_OA field.
  - Only those records marked as true for the is_OA field can be said to be OA. Others, with false or no value for the is_OA field, have unknown status (i.e. not necessarily closed access).

Q2: How are papers, published by the selected universities, distributed across the three scientific disciplines of our choice?
- Dataset Name format: fsid_countryname_papers.csv
- Dataset Contents: For the given country, all papers for all the universities listed in THEWUR with the information of the fieldofstudy they belong to.
- Important considerations about this dataset:
  - MAG can associate a paper with multiple fieldofstudyid values. If a paper belongs to more than one of our fieldofstudyid values, separate records were created for the paper with each of those values.
  - MAG assigns a fieldofstudyid to every paper with a score. We preserve only those records whose score is more than 0.5 for any fieldofstudyid it belongs to.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Q3: What is the gender distribution in authorship of papers published by the universities?
- Dataset Name format: author_gender_countryname_papers.csv
- Dataset Contents: All papers with their author names for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
  - An external script was executed to determine the gender of the authors. The script is available here.

Q4: Distribution of staff seniority (= number of years from their first publication until the last publication) in the given university.
- Dataset Name format: author_ids_countryname_papers.csv
- Dataset Contents: For a given country, all papers for authors with their publication year for all the universities listed in THEWUR.
- Important considerations about this work:
  - When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
  - Calculating staff seniority can be achieved in various ways. The most straightforward option is to calculate it as academic_age = MAX(year) - MIN(year) for each authorid.

Q5: Citation counts (incoming) for OA vs Non-OA papers published by the university.
- Dataset Name format: cc_oa_countryname_papers.csv
- Dataset Contents: OA status and OA links for all papers of all the universities listed in THEWUR and, for each of those papers, the count of incoming citations available in MAG.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - Only those records marked as true for the is_OA field can be said to be OA. Others, with false or no value for the is_OA field, have unknown status (i.e. not necessarily closed access).

Q6: Count of OA vs Non-OA references (outgoing) for all papers published by universities.
- Dataset Name format: rc_oa_countryname_-papers.csv
- Dataset Contents: Counts of all OA and unknown papers referenced by all papers published by all the universities listed in THEWUR.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers being referenced.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Additional files:
- _fieldsofstudy_mag_.csv: this file contains a dump of the fieldsofstudy table of MAG, mapping each of the ids to its actual field of study name.
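Two of the considerations in the description above lend themselves to a short illustration: the three-state is_OA field (true, false, or missing) from Q1 and Q5, and the academic_age = MAX(year) - MIN(year) rule from Q4. The following is a minimal sketch, assuming rows have already been read into dictionaries keyed by the column names is_OA, authorid and year; the sample rows and the paperid values are invented for illustration and are not taken from the dataset.

```python
from collections import defaultdict


def oa_summary(records):
    """Count records known to be OA versus records of unknown status.

    Per the description, only is_OA == true means OA; false or missing
    values are 'unknown', not 'closed access'.
    """
    known_oa = sum(
        1 for row in records
        if str(row.get("is_OA", "")).strip().lower() == "true"
    )
    total = len(records)
    unknown = total - known_oa
    # This share is a lower bound on the real OA share, since some
    # 'unknown' papers may in fact be open access.
    share = known_oa / total if total else 0.0
    return known_oa, unknown, share


def academic_age(records):
    """academic_age = MAX(year) - MIN(year) for each authorid."""
    years = defaultdict(list)
    for row in records:
        years[row["authorid"]].append(int(row["year"]))
    return {author: max(ys) - min(ys) for author, ys in years.items()}


# Hypothetical sample rows (not real data).
papers = [
    {"paperid": "p1", "is_OA": "true"},
    {"paperid": "p2", "is_OA": "false"},
    {"paperid": "p3", "is_OA": ""},
]
authors = [
    {"authorid": "a1", "year": "2005"},
    {"authorid": "a1", "year": "2019"},
    {"authorid": "a2", "year": "2018"},
]

known_oa, unknown, share = oa_summary(papers)
ages = academic_age(authors)
print(known_oa, unknown, round(share, 3))
print(ages)
```

Treating false and missing values as one "unknown" bucket follows the dataset's own caveat; an analysis that needs a closed-access count would require a different OA source.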
Success.ai | LinkedIn Data | 700M Public Profiles & 70M Companies – Best...
datarade.ai
Updated Jan 1, 2022
+ more versions
Cite
Success.ai (2022). Success.ai | LinkedIn Data | 700M Public Profiles & 70M Companies – Best Price Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-data-700m-public-profiles-70m-compa-success-ai-294c
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Jan 1, 2022
Area covered
Austria, Luxembourg, Singapore, Montserrat, Mauritius, Saudi Arabia, Greenland, Estonia, Virgin Islands (British), Mayotte
Description
Success.ai’s LinkedIn Data Solutions offer unparalleled access to a vast dataset of 700 million public LinkedIn profiles and 70 million LinkedIn company records, making it one of the most comprehensive and reliable LinkedIn datasets available on the market today. Our employee data and LinkedIn data are ideal for businesses looking to streamline recruitment efforts, build highly targeted lead lists, or develop personalized B2B marketing campaigns.

Whether you’re looking for recruiting data, conducting investment research, or seeking to enrich your CRM systems with accurate and up-to-date LinkedIn profile data, Success.ai provides everything you need with pinpoint precision. By tapping into LinkedIn company data, you’ll have access to over 40 critical data points per profile, including education, professional history, and skills.

Key Benefits of Success.ai’s LinkedIn Data: Our LinkedIn data solution offers more than just a dataset. With GDPR-compliant data, AI-enhanced accuracy, and a price match guarantee, Success.ai ensures you receive the highest-quality data at the best price in the market. Our datasets are delivered in Parquet format for easy integration into your systems, and with millions of profiles updated daily, you can trust that you’re always working with fresh, relevant data.

Global Reach and Industry Coverage: Our LinkedIn data covers professionals across all industries and sectors, providing you with detailed insights into businesses around the world. Our geographic coverage spans 259M profiles in the United States, 22M in the United Kingdom, 27M in India, and thousands of profiles in regions such as Europe, Latin America, and Asia Pacific. With LinkedIn company data, you can access profiles of top companies from the United States (6M+), United Kingdom (2M+), and beyond, helping you scale your outreach globally.

Why Choose Success.ai’s LinkedIn Data: Success.ai stands out for its tailored approach and white-glove service, making it easy for businesses to receive exactly the data they need without managing complex data platforms. Our dedicated Success Managers will curate and deliver your dataset based on your specific requirements, so you can focus on what matters most—reaching the right audience. Whether you’re sourcing employee data, LinkedIn profile data, or recruiting data, our service ensures a seamless experience with 99% data accuracy.

Best Price Guarantee: We offer unbeatable pricing on LinkedIn data, and we’ll match any competitor.

Global Scale: Access 700 million LinkedIn profiles and 70 million company records globally.

AI-Verified Accuracy: Enjoy 99% data accuracy through our advanced AI and manual validation processes.

Real-Time Data: Profiles are updated daily, ensuring you always have the most relevant insights.

Tailored Solutions: Get custom-curated LinkedIn data delivered directly, without managing platforms.

Ethically Sourced Data: Compliant with global privacy laws, ensuring responsible data usage.

Comprehensive Profiles: Over 40 data points per profile, including job titles, skills, and company details.

Wide Industry Coverage: Covering sectors from tech to finance across regions like the US, UK, Europe, and Asia.

Key Use Cases:

Sales Prospecting and Lead Generation: Build targeted lead lists using LinkedIn company data and professional profiles, helping sales teams engage decision-makers at high-value accounts.

Recruitment and Talent Sourcing: Use LinkedIn profile data to identify and reach top candidates globally. Our employee data includes work history, skills, and education, providing all the details you need for successful recruitment.

Account-Based Marketing (ABM): Use our LinkedIn company data to tailor marketing campaigns to key accounts, making your outreach efforts more personalized and effective.

Investment Research & Due Diligence: Identify companies with strong growth potential using LinkedIn company data. Access key data points such as funding history, employee count, and company trends to fuel investment decisions.

Competitor Analysis: Stay ahead of your competition by tracking hiring trends, employee movement, and company growth through LinkedIn data. Use these insights to adjust your market strategy and improve your competitive positioning.

CRM Data Enrichment: Enhance your CRM systems with real-time updates from Success.ai’s LinkedIn data, ensuring that your sales and marketing teams are always working with accurate and up-to-date information.

Comprehensive Data Points for LinkedIn Profiles: Our LinkedIn profile data includes over 40 key data points for every individual and company, ensuring a complete understanding of each contact:

LinkedIn URL: Access direct links to LinkedIn profiles for immediate insights.

Full Name: Verified first and last names.

Job Title: Current job titles, and prior experience.

Company Information: Company name, LinkedIn URL, domain, and location.

Work and Per...
Generative AI In Coding Market Analysis, Size, and Forecast 2025-2029: North...
technavio.com
pdf
Updated Jul 26, 2025
Cite
Technavio (2025). Generative AI In Coding Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Europe (France, Germany, and UK), APAC (China, India, Japan, and South Korea), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/generative-ai-in-coding-market-industry-analysis
Available download formats: pdf
Dataset updated
Jul 26, 2025
Dataset provided by
TechNavio
Authors
Technavio
License
https://www.technavio.com/content/privacy-notice
Time period covered
2025 - 2029
Area covered
United States
Description
Generative AI In Coding Market Size 2025-2029

The generative AI in coding market size is forecast to increase by USD 10.22 billion, at a CAGR of 32.7% between 2024 and 2029.
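The relationship between the two headline figures can be sketched with the standard compound-growth formula. This is an illustration only: it assumes the USD 10.22 billion is the difference between the 2029 and 2024 market sizes and that the 32.7% rate compounds annually over those five years. The base and end values it derives are implied by those assumptions and are not stated in the summary.

```python
def growth_factor(cagr: float, years: int) -> float:
    """Compound growth multiplier after `years` periods at rate `cagr`."""
    return (1.0 + cagr) ** years


def implied_base(increment: float, cagr: float, years: int) -> float:
    """Starting value B such that B * growth_factor(cagr, years) - B == increment."""
    return increment / (growth_factor(cagr, years) - 1.0)


# Figures quoted in the summary; the 5-year span (2024 -> 2029) is an assumption.
INCREMENT_USD_BN = 10.22
CAGR = 0.327
YEARS = 5

base_2024 = implied_base(INCREMENT_USD_BN, CAGR, YEARS)
size_2029 = base_2024 * growth_factor(CAGR, YEARS)

print(f"implied 2024 size: USD {base_2024:.2f} billion")
print(f"implied 2029 size: USD {size_2029:.2f} billion")
```

If the report measures the increment over a different window, YEARS would need to change and the implied values would shift accordingly.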

The market is experiencing significant growth, driven by growing demand for higher developer productivity and faster innovation cycles. Companies are recognizing the potential of generative AI to automate coding tasks, reducing the time and effort required for software development. However, this shift towards AI-driven coding is not without challenges. Concerns about security, accuracy, and intellectual property are key obstacles to the adoption of generative AI in coding. Ensuring the security of code generated by AI is essential, as any vulnerabilities could lead to significant risks. Semantic reasoning and predictive analytics are transforming decision making, while AI-powered chatbots and virtual assistants enhance customer service. Addressing intellectual property concerns is also necessary to ensure ownership and control over the generated code. As the market continues to evolve, companies must adapt to these challenges and focus on integrating generative AI into enterprise platforms rather than relying on individual tools. By doing so, they can mitigate risks, improve efficiency, and drive innovation in their software development processes. Overall, the market presents significant opportunities for businesses seeking to streamline their development processes and stay competitive in the rapidly evolving tech landscape. Real-time anomaly detection and latency reduction techniques are critical for maintaining the reliability and accuracy of these systems.

What will be the Size of the Generative AI In Coding Market during the forecast period?

Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

The market for generative AI in coding continues to evolve, with applications spanning various sectors including finance, healthcare, and manufacturing. Deployment scalability and model performance benchmarking are critical factors as organizations seek to optimize their AI models. Training dataset size plays a significant role in model accuracy, with larger datasets often leading to improved results. Ethical AI considerations, such as model explainability and fairness metrics, are increasingly important as AI becomes more prevalent in business operations. One example of the market's dynamic nature can be seen in the use of code readability assessment and accuracy measurements in software development. Model bias, data privacy, and data security remain critical concerns.

By analyzing code complexity and vulnerability detection, organizations can improve code quality and reduce the risk of security flaws. Neural network training and model fine-tuning are ongoing processes, with AI models requiring continuous updates to maintain optimal performance. According to recent industry reports, the generative AI market in coding is expected to grow by over 25% annually in the coming years, driven by advancements in explainable AI, bias mitigation strategies, and the increasing demand for more efficient and accurate coding solutions. Additionally, techniques such as data augmentation, AUC calculation, and ROC curve analysis are becoming increasingly important for improving model performance and reducing the need for large training datasets.

How is this Generative AI In Coding Market segmented?

The generative AI in coding market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

Application
- Code generation
- Code enhancement
- Language translation
- Code reviews

End-user
- Data science and analytics
- Web and application development
- Game development and design
- IoT and smart devices
- Others

Type
- Python
- JavaScript
- Java
- Others

Geography
- North America: US, Canada, Mexico
- Europe: France, Germany, UK
- APAC: China, India, Japan, South Korea
- Rest of World (ROW)

By Application Insights

The Code generation segment is estimated to witness significant growth during the forecast period. The market is witnessing significant advancements in automating software development processes. Code generation AI, a key segment, automates the creation of new source code from user inputs, addressing the time-consuming aspect of writing boilerplate or repetitive code. This technology has evolved from simple code completions to generating complex functions, classes, and even entire application scaffolds. Tools integrated with IDEs and version control systems, such as GitHub Copilot, enhance developer productivity. Program synthesis
#IndiaNeedsOxygen Tweets
kaggle.com
zip
Updated Nov 14, 2021
Cite
Kash (2021). #IndiaNeedsOxygen Tweets [Dataset]. https://www.kaggle.com/kaushiksuresh147/indianeedsoxygen-tweets
Available download formats: zip (4441094 bytes)
Dataset updated
Nov 14, 2021
Authors
Kash
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
India marks one COVID-19 death every 5 minutes

Content

People across India scrambled for life-saving oxygen supplies on Friday and patients lay dying outside hospitals as the capital recorded the equivalent of one death from COVID-19 every five minutes.

For the second day running, the country’s overnight infection total was higher than ever recorded anywhere in the world since the pandemic began last year, at 332,730.

India’s second wave has hit with such ferocity that hospitals are running out of oxygen, beds, and anti-viral drugs. Many patients have been turned away because there was no space for them, doctors in Delhi said.

Mass cremations have been taking place as the crematoriums have run out of space. Ambulance sirens sounded throughout the day in the deserted streets of the capital, one of India’s worst-hit cities, where a lockdown is in place to try and stem the transmission of the virus.

Dataset

The dataset consists of tweets posted with the #IndiaWantsOxygen hashtag during the past week. It contains 25,440 tweets in total and will be updated on a daily basis.

The description of the features is given below:

| No | Columns | Descriptions |
| -- | -- | -- |
| 1 | user_name | The name of the user, as they’ve defined it. |
| 2 | user_location | The user-defined location for this account’s profile. |
| 3 | user_description | The user-defined UTF-8 string describing their account. |
| 4 | user_created | Time and date, when the account was created. |
| 5 | user_followers | The number of followers an account currently has. |
| 6 | user_friends | The number of friends an account currently has. |
| 7 | user_favourites | The number of favorites an account currently has |
| 8 | user_verified | When true, indicates that the user has a verified account |
| 9 | date | UTC time and date when the Tweet was created |
| 10 | text | The actual UTF-8 text of the Tweet |
| 11 | hashtags | All the other hashtags posted in the tweet along with #IndiaWantsOxygen |
| 12 | source | Utility used to post the Tweet, Tweets from the Twitter website have a source value - web |
| 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |
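The column list above can be turned into typed records when loading the data. The sketch below is hedged throughout: it covers only a subset of the columns, and it assumes that dates look like "YYYY-MM-DD HH:MM:SS", that booleans are stored as the strings True/False, and that hashtags are stored as a Python-style list literal. None of these encodings is confirmed by the description, and the sample row is invented.

```python
import ast
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    user_name: str
    user_followers: int
    user_verified: bool
    date: datetime
    text: str
    hashtags: list
    is_retweet: bool


def parse_bool(value: str) -> bool:
    # Assumption: booleans are serialised as "True"/"False".
    return value.strip().lower() == "true"


def parse_hashtags(value: str) -> list:
    # Assumption: hashtags are serialised as a list literal, e.g. "['a', 'b']".
    value = value.strip()
    if not value:
        return []
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return [value]  # fall back to the raw string
    return list(parsed) if isinstance(parsed, (list, tuple)) else [str(parsed)]


def read_tweets(handle) -> list:
    """Read rows from an open CSV handle into Tweet records."""
    tweets = []
    for row in csv.DictReader(handle):
        tweets.append(Tweet(
            user_name=row["user_name"],
            user_followers=int(float(row["user_followers"] or 0)),
            user_verified=parse_bool(row["user_verified"]),
            # Assumption: timestamp format; adjust if the file differs.
            date=datetime.strptime(row["date"], "%Y-%m-%d %H:%M:%S"),
            text=row["text"],
            hashtags=parse_hashtags(row["hashtags"]),
            is_retweet=parse_bool(row["is_retweet"]),
        ))
    return tweets


# Invented sample row, standing in for the real CSV file.
SAMPLE = (
    "user_name,user_followers,user_verified,date,text,hashtags,is_retweet\n"
    "example_user,120,False,2021-04-23 10:15:00,Need oxygen in Delhi,"
    "\"['IndiaNeedsOxygen', 'Covid19']\",False\n"
)

tweets = read_tweets(io.StringIO(SAMPLE))
print(tweets[0])
```

To use it on the downloaded file, pass an open file handle (with newline="" as the csv module recommends) instead of the in-memory sample.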

Acknowledgements

https://globalnews.ca/news/7785122/india-covid-19-hospitals-record/

Image courtesy: BBC and Reuters

Inspiration

The past few days have been really depressing after seeing these incidents. These tweets are the voice of Indians requesting help and of people all over the globe asking their own countries to support India by providing oxygen tanks.

And I strongly believe that this is not just some data, but the pure emotions of people and their call for help. And I hope we as data scientists could contribute on this front by providing valuable information and insights.
Data from: Supplementary information files for Height and body-mass index...
repository.lboro.ac.uk
search.datacite.org
pdf
Updated May 30, 2023
Cite
NCD Risk Factor Collaboration; Oonagh Markey (2023). Supplementary information files for Height and body-mass index trajectories of school-aged children and adolescents from 1985 to 2019 in 200 countries and territories: a pooled analysis of 2181 population-based studies with 65 million participants [Dataset]. http://doi.org/10.17028/rd.lboro.13241105.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.17028/rd.lboro.13241105.v1
Dataset updated
May 30, 2023
Dataset provided by
Loughborough University
Authors
NCD Risk Factor Collaboration; Oonagh Markey
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplementary files for the article "Height and body-mass index trajectories of school-aged children and adolescents from 1985 to 2019 in 200 countries and territories: a pooled analysis of 2181 population-based studies with 65 million participants".

Background
Comparable global data on health and nutrition of school-aged children and adolescents are scarce. We aimed to estimate age trajectories and time trends in mean height and mean body-mass index (BMI), which measures weight gain beyond what is expected from height gain, for school-aged children and adolescents.

Methods
For this pooled analysis, we used a database of cardiometabolic risk factors collated by the Non-Communicable Disease Risk Factor Collaboration. We applied a Bayesian hierarchical model to estimate trends from 1985 to 2019 in mean height and mean BMI in 1-year age groups for ages 5–19 years. The model allowed for non-linear changes over time in mean height and mean BMI and for non-linear changes with age of children and adolescents, including periods of rapid growth during adolescence.

Findings
We pooled data from 2181 population-based studies, with measurements of height and weight in 65 million participants in 200 countries and territories. In 2019, we estimated a difference of 20 cm or higher in mean height of 19-year-old adolescents between countries with the tallest populations (the Netherlands, Montenegro, Estonia, and Bosnia and Herzegovina for boys; and the Netherlands, Montenegro, Denmark, and Iceland for girls) and those with the shortest populations (Timor-Leste, Laos, Solomon Islands, and Papua New Guinea for boys; and Guatemala, Bangladesh, Nepal, and Timor-Leste for girls). In the same year, the difference between the highest mean BMI (in Pacific island countries, Kuwait, Bahrain, The Bahamas, Chile, the USA, and New Zealand for both boys and girls and in South Africa for girls) and lowest mean BMI (in India, Bangladesh, Timor-Leste, Ethiopia, and Chad for boys and girls; and in Japan and Romania for girls) was approximately 9–10 kg/m².

In some countries, children aged 5 years started with healthier height or BMI than the global median and, in some cases, as healthy as the best performing countries, but they became progressively less healthy compared with their comparators as they grew older by not growing as tall (eg, boys in Austria and Barbados, and girls in Belgium and Puerto Rico) or gaining too much weight for their height (eg, girls and boys in Kuwait, Bahrain, Fiji, Jamaica, and Mexico; and girls in South Africa and New Zealand). In other countries, growing children overtook the height of their comparators (eg, Latvia, Czech Republic, Morocco, and Iran) or curbed their weight gain (eg, Italy, France, and Croatia) in late childhood and adolescence.

When changes in both height and BMI were considered, girls in South Korea, Vietnam, Saudi Arabia, Turkey, and some central Asian countries (eg, Armenia and Azerbaijan), and boys in central and western Europe (eg, Portugal, Denmark, Poland, and Montenegro) had the healthiest changes in anthropometric status over the past 3·5 decades because, compared with children and adolescents in other countries, they had a much larger gain in height than they did in BMI. The unhealthiest changes (gaining too little height, too much weight for their height compared with children in other countries, or both) occurred in many countries in sub-Saharan Africa, New Zealand, and the USA for boys and girls; in Malaysia and some Pacific island nations for boys; and in Mexico for girls.

Interpretation
The height and BMI trajectories over age and time of school-aged children and adolescents are highly variable across countries, which indicates heterogeneous nutritional quality and lifelong health advantages and risks.
Supplementary material for the thesis "Volcanic architecture of the Deccan...
ordo.open.ac.uk
doc
Updated Sep 9, 2024
Cite
Anne E. Jay (2024). Supplementary material for the thesis "Volcanic architecture of the Deccan Traps, western Maharashtra, India: an integrated chemostratigraphic and paleomagnetic study" [Dataset]. http://doi.org/10.21954/ou.rd.26969632.v1
Available download formats: doc
Unique identifier
https://doi.org/10.21954/ou.rd.26969632.v1
Dataset updated
Sep 9, 2024
Dataset provided by
The Open University
Authors
Anne E. Jay
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Area covered
Maharashtra, India
Description
This collection comprises the folders contained on a CD-ROM which was attached to the thesis when it was submitted in 2005. It was uploaded to ORDO in 2024 for preservation purposes. For more information, please refer to the thesis "Volcanic architecture of the Deccan Traps, western Maharashtra, India: an integrated chemostratigraphic and paleomagnetic study" on ORO.

Thesis abstract
Detailed volcanostratigraphic logs of seven traverses up the lava sequence in the Western Ghats, Deccan Traps, India, are presented. The main study area, the Mahabaleshwar Plateau, was chosen because the lavas were emplaced around the time of the Cretaceous-Tertiary Boundary and because there is access to exposed lavas on three of its four sides, permitting investigation of the volcanic architecture in 3-D. Besides characteristics of the lava units, the logs include integrated geochemical and palaeomagnetic samples. The lava pile is dominated by pahoehoe sheet lobes and smaller lobes and toes. It can be divided into flow-fields, the products of one eruption, by the occurrence of weathering horizons. Palaeomagnetic results demonstrate that the chron 29R/29N reversal boundary horizon occurs in all four of the traverses around the Plateau and nearby Khumbarli Ghat. The elevation of the reversal horizon on each traverse varies between 897-945 m and 982 m, a value greater than that predicted by the small regional dip. Statistical analysis of geochemical data from samples taken between the reversal horizon and the base of the Mahabaleshwar Formation does not show any apparent correlation around the Mahabaleshwar Plateau, indicating that individual sheet lobes are less than 20 km wide. Determining the lateral extent of flow-fields is not possible using this method, but from the occurrence of a similar number of flow-fields in three traverses of similar length around the Plateau, it is probable that most flow-fields are at least as wide as the Mahabaleshwar Plateau (more than 20 km). Comparing the thickness of the lava pile between the base of the Mahabaleshwar Formation, the palaeomagnetic reversal horizon and the laterite cap shows that as much as 95 m of topography occurred on the surface of the active Deccan lavas over a distance of approximately 20 km. The volcanic architecture is controlled by the morphology of small sheet lobes, large sheet lobes, and, on a larger scale, flow-fields. These observations, and the varying number of individual sheet lobes making up flow-fields, demonstrate that the structure of the Deccan lava province at the level of eruptive units is extremely complex.
Career promotions, research publications, Open Access dataset

2 scholarly articles cite this dataset (View in Google Scholar)

zipAvailable download formats

Unique identifier

https://doi.org/10.21954/ou.rd.19228785.v1

Dataset updated

Feb 28, 2022

Dataset provided by

The Open University

Authors

Matteo Cancellieri; Nancy Pontika; David Pride; Petr Knoth; Hannah Metzler; Antonia Correia; Helene Brinken; Bikash Gyawali

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset is a compilation of processed data on citations and references for research papers, including their author, institution and open access information, for a selected sample of academics analysed using Microsoft Academic Graph (MAG) data and CORE. The data was collected from December 2019 to January 2020.

Six countries (Austria, Brazil, Germany, India, Portugal, United Kingdom and United States) were the focus of the six questions which make up this dataset. There is one csv file per country and per question (36 files in total). More details about the creation of this dataset are available in the public ON-MERRIT D3.1 deliverable report.

The dataset combines two data sources: one part was created by analysing promotion policies across the target countries, while the second part is a set of data points for understanding publishing behaviour. To facilitate the analysis, the dataset is organised in the following seven folders:

PRT
The file "PRT_policies.csv" contains the information extracted from promotion, review and tenure (PRT) policies.

Q1: What % of papers coming from a university are Open Access?
- Dataset Name format: oa_status_countryname_papers.csv
- Dataset Contents: Open Access (OA) status of all papers of all the universities listed in Times Higher Education World University Rankings (THEWUR) for the given country. A paper is marked OA if at least one OA link is available. OA links are collected using the CORE Discovery API.
- Important considerations about this dataset:
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - The service we used to recognise if a paper is OA, CORE Discovery, does not contain entries for all paperids in MAG. This implies that some of the records in the dataset will have neither a true nor a false value for the is_OA field.
  - Only those records marked as true for the is_OA field can be said to be OA. Others, with false or no value for the is_OA field, have unknown status (i.e. not necessarily closed access).

Q2: How are papers, published by the selected universities, distributed across the three scientific disciplines of our choice?
- Dataset Name format: fsid_countryname_papers.csv
- Dataset Contents: For the given country, all papers for all the universities listed in THEWUR, with the fieldofstudy they belong to.
- Important considerations about this dataset:
  - MAG can associate a paper with multiple fieldofstudyid values. If a paper belongs to more than one of our fieldofstudyid values, separate records were created for the paper, one for each fieldofstudyid.
  - MAG assigns a fieldofstudyid to every paper with a score. We preserve only those records whose score is more than 0.5 for any fieldofstudyid the paper belongs to.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Q3: What is the gender distribution in authorship of papers published by the universities?
- Dataset Name format: author_gender_countryname_papers.csv
- Dataset Contents: All papers with their author names for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, only the records for collaborators from within the selected universities are preserved.
  - An external script was executed to determine the gender of the authors. The script is available here.

Q4: Distribution of staff seniority (= number of years from their first publication until the last publication) in the given university.
- Dataset Name format: author_ids_countryname_papers.csv
- Dataset Contents: For a given country, all papers for authors with their publication year for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, only the records for collaborators from within the selected universities are preserved.
  - Staff seniority can be calculated in various ways. The most straightforward option is to calculate it as academic_age = MAX(year) - MIN(year) for each authorid.

Q5: Citation counts (incoming) for OA vs Non-OA papers published by the university.
- Dataset Name format: cc_oa_countryname_papers.csv
- Dataset Contents: OA status and OA links for all papers of all the universities listed in THEWUR and, for each of those papers, the count of incoming citations available in MAG.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - Only those records marked as true for the is_OA field can be said to be OA. Others, with false or no value for the is_OA field, have unknown status (i.e. not necessarily closed access).

Q6: Count of OA vs Non-OA references (outgoing) for all papers published by universities.
- Dataset Name format: rc_oa_countryname_-papers.csv
- Dataset Contents: Counts of all OA and unknown papers referenced by all papers published by all the universities listed in THEWUR.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers being referenced.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Additional files:
- _fieldsofstudy_mag_.csv: this file contains a dump of the fieldsofstudy table of MAG, mapping each of the ids to its actual field of study name.
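The staff seniority formula given for Q4, academic_age = MAX(year) - MIN(year) per authorid, can be sketched as follows. This is a minimal illustration, not the authors' script: the column names authorid, paperid and year are assumptions about the layout of author_ids_countryname_papers.csv, and the sample rows are invented for the example.

```python
import csv
import io
from collections import defaultdict


def academic_age(rows):
    """Return {authorid: MAX(year) - MIN(year)} for an iterable of row dicts.

    Assumes each row has 'authorid' and 'year' keys (hypothetical column names).
    """
    years = defaultdict(list)
    for row in rows:
        years[row["authorid"]].append(int(row["year"]))
    return {author: max(ys) - min(ys) for author, ys in years.items()}


# Invented sample standing in for author_ids_countryname_papers.csv.
SAMPLE = """authorid,paperid,year
a1,p1,2005
a1,p2,2019
a2,p3,2012
"""

ages = academic_age(csv.DictReader(io.StringIO(SAMPLE)))
print(ages)
```

An author with a single paper gets an academic age of 0 under this definition, which may be worth filtering out before plotting a seniority distribution.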
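Because only records with is_OA set to true are known to be OA, a Q1-style percentage computed from these files is best read as a lower bound on the true OA share. The sketch below shows one way to count this, assuming the is_OA column is stored as the strings "true", "false" or an empty value; that encoding, the paperid column and the sample rows are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def oa_summary(rows):
    """Return (known_oa, unknown, known_oa_share) for an iterable of row dicts.

    Rows whose is_OA is false or missing are counted as 'unknown',
    not as closed access, following the dataset description.
    """
    counts = Counter()
    for row in rows:
        value = (row.get("is_OA") or "").strip().lower()
        if value == "true":
            counts["oa"] += 1
        else:
            counts["unknown"] += 1
    total = counts["oa"] + counts["unknown"]
    share = counts["oa"] / total if total else 0.0
    return counts["oa"], counts["unknown"], share


# Invented sample standing in for oa_status_countryname_papers.csv.
SAMPLE = """paperid,is_OA
p1,true
p2,false
p3,
p4,true
"""

known_oa, unknown, share = oa_summary(csv.DictReader(io.StringIO(SAMPLE)))
print(known_oa, unknown, share)
```

The same grouping applies to the Q5 files when comparing citation counts: the comparison is between known-OA papers and papers of unknown status, rather than OA versus closed access.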
