ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This dataset shows the location of Higher Education (HE) and Further Education (FE) institutes in Great Britain. This should cover Universities and Colleges. Many institutes have more than one campus and, where possible, this is reflected in the data, so a University may have more than one entry. Postcodes have also been included for institutes where possible. This data was collected from various sources connected with HEFE in the UK, including JISC and EDINA. This represents the fullest list that the author could compile from various sources. If you spot a missing institution, please contact the author and they will add it to the dataset. GIS vector data. This dataset was first accessioned in the EDINA ShareGeo Open repository on 2011-02-01 and migrated to Edinburgh DataShare on 2017-02-21.
The USR consists of records of undergraduate students on courses of one academic year or more; postgraduate students on courses of one academic year or more; academic and related staff holding regular salaried appointments, and finance data for all UK universities.
The Finance dataset contains details of income and expenditure for all of the UK universities. These data are contained in a series of files for each year. For detailed information on structure and content of these files users should refer to the documentation that accompanies this dataset. Also included in the Finance dataset is the Student Load data. Student Load is, in the USR context, a reallocation of student-head count numbers, by apportioning them as a percentage to the departmental cost centres where they are taught, thus enabling student load, staff and financial data to be brought together.
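The apportioning described for Student Load can be illustrated with a small sketch. The course names, cost centres and percentages below are invented for illustration only, and the function name `student_load` is hypothetical; the actual USR file layout is described in the documentation that accompanies the dataset.

```python
from collections import defaultdict


def student_load(headcounts, apportionment):
    """Reallocate student head counts to departmental cost centres.

    headcounts:    {course: number of students}              (hypothetical input)
    apportionment: {course: {cost_centre: percentage}}       (hypothetical input)
    Each course's percentages are expected to sum to 100.
    """
    load = defaultdict(float)
    for course, count in headcounts.items():
        shares = apportionment[course]
        total = sum(shares.values())
        if abs(total - 100.0) > 1e-9:
            raise ValueError(f"percentages for {course!r} sum to {total}, not 100")
        for cost_centre, pct in shares.items():
            # A share of the head count goes to the cost centre that teaches it.
            load[cost_centre] += count * pct / 100.0
    return dict(load)


# Invented example data, not taken from the USR.
headcounts = {"BSc Physics": 120, "BSc Chemistry": 80}
apportionment = {
    "BSc Physics": {"Physics": 75.0, "Mathematics": 25.0},
    "BSc Chemistry": {"Chemistry": 50.0, "Physics": 50.0},
}
load = student_load(headcounts, apportionment)
print(load)
```

Because every head count is fully distributed, the loads across cost centres add up to the total head count, which is what allows student, staff and finance figures to be compared per cost centre.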
Abstract copyright UK Data Service and data collection copyright owner. Main Topics: Finance: income and expenditure; university; cost centre. Student load: undergraduate, postgraduate (taught course or research); cost centre. No information recorded Annual returns from each university.
The GLA commissioned the Social Market Foundation to look at the reasons behind the non-continuation (drop-out) rate of undergraduates studying at London’s higher education institutions. This report seeks to understand the factors affecting non-continuation and transfers at London universities. London’s non-continuation rate is 7.7%, which is much higher than the English average of 6.3%, and students in London are the most likely in the country to transfer to another university. We seek to build on previous SMF work by focusing on why students leave university in London, and the report looks in depth at the differences in retention by ethnicity and socio-economic status. This report draws on qualitative and quantitative evidence: interviews with 20 individuals from London who attended and withdrew from a London university, and quantitative analysis of HESA data on young students in London between 2013/14 and 2015/16.
This dataset presents a cluster analysis of UK universities based on four synthetic environments: social, cultural, physical and economic. These were developed based on variables that represented an educational ecosystem of well-being. The cluster analysis was initially linked to the LSYPE-Secure dataset using the UKPRNs (i.e. higher education institutional number) and hence the cluster analysis used data from around 2009-2012 to represent Wave 6 and Wave 7 of the LSYPE-Secure dataset. The cluster analysis was based on a variety of variables available from HESA and the Office for Students (OfS) to represent these environments, for example:
Social: demographics of students and staff including ethnicity and sex
Cultural: data on research and teaching scores
Economic: data on student:staff ratio and expenditure
Physical: data related to the built and natural environment including residential sites, blue and green spaces
In April 2018, the UK Office for Students (OfS) noted that students from underrepresented groups such as black and minority ethnic (BME) students and those from disadvantaged backgrounds were less likely to succeed at university. Coupled with this, research has shown that students from these groups are also more likely to have poorer mental health and wellbeing. However, there is substantial social and political pressure on universities to act to improve student mental health. For example, the Telegraph ran the headline "Do British universities have a suicide problem?" Thus, in June 2018, the Hon. Sam Gyimah, the then UK universities minister, informed university vice-chancellors that student mental health and wellbeing has to be one of their top priorities. Universities are investing substantive sums in activities to tackle student mental health but doing so with no evidence base to guide strategic policy and practice. 
These activities may potentially be ineffective, financially wasteful, and possibly, counter-productive. Therefore, we need a better evidence base, which this project intends to provide. Currently, there is a lack of evidence and understanding about which groups of young people going to universities may have poorer life outcomes (such as education, employment, and mental health and well-being) as a result of their mental health and wellbeing during their adolescent years. These life outcomes and their mental health and wellbeing, however, need to be understood in the context of the complex social identities of the young people, such as the intersections between their gender, ethnicity, sexuality, religion and socio-economic status. Otherwise, these young people may feel misunderstood or judged. Most of the large body of quantitative research on life outcomes tends to focus on one social characteristic/identity of the student, such as the young person's gender or ethnicity or socio-economic status, but not the combination of all of these, i.e. the intersectionalities. Primarily, the reason for this has been the lack of sufficient data. This research draws on data from the Longitudinal Study of Young People in England (LSYPE), which tracked over 15,000 adolescents' education and health over 7 years between 2004 and 2010 (from when they were 13-19 years old), and the Next Steps Survey, which collected data from the same individuals in 2015 when they were 25 years old and in the job market. This dataset also had an ethnic boost, which thus allows for the exploratory analysis of intersectionalities. Currently, there are a number of interventions being implemented to improve the university environment. However, there is a lack of evidence on how the university environment (such as its size, amount of academic support available, availability of sports activities, students' sense of belonging, etc.) can affect students' mental health, wellbeing and life outcomes. This evidence can be determined by using the LSYPE data supplemented by university environment data from the National Student Survey (NSS) and the Higher Education Statistics Agency (HESA). Thus this research uses an intersectional approach to investigate the extent to which the life outcomes of young persons who go to university are affected by their social inequality groupings and mental health and well-being during adolescence. Additionally, this research also aims to determine the characteristics of university environments that can improve the life outcomes of these young people depending on their social and mental health/wellbeing background. We used secondary data analysis of mainly HESA and OfS variables and created derived variables.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a compilation of processed data on citation and references for research papers including their author, institution and open access info for a selected sample of academics analysed using Microsoft Academic Graph (MAG) data and CORE. The data for this dataset was collected during December 2019 to January 2020.

Six countries (Austria, Brazil, Germany, India, Portugal, United Kingdom and United States) were the focus of the six questions which make up this dataset. There is one csv file per country and per question (36 files in total). More details about the creation of this dataset are available on the public ON-MERRIT D3.1 deliverable report.

The dataset is a combination of two different data sources: one part is a dataset created by analysing promotion policies across the target countries, while the second part is a set of data points available to understand the publishing behaviour. To facilitate the analysis the dataset is organised in the following seven folders:

PRT
The dataset with the file name "PRT_policies.csv" contains the related information as this was extracted from promotion, review and tenure (PRT) policies.

Q1: What % of papers coming from a university are Open Access?
- Dataset Name format: oa_status_countryname_papers.csv
- Dataset Contents: Open Access (OA) status of all papers of all the universities listed in Times Higher Education World University Rankings (THEWUR) for the given country. A paper is marked OA if there is at least one OA link available. OA links are collected using the CORE Discovery API.
- Important considerations about this dataset:
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - The service we used to recognise if a paper is OA, CORE Discovery, does not contain entries for all paperids in MAG. This implies that some of the records in the dataset extracted will not have either a true or false value for the _is_OA_ field.
  - Only those records marked as true for the _is_OA_ field can be said to be OA. Others with false or no value for the is_OA field are of unknown status (i.e. not necessarily closed access).

Q2: How are papers, published by the selected universities, distributed across the three scientific disciplines of our choice?
- Dataset Name format: fsid_countryname_papers.csv
- Dataset Contents: For the given country, all papers for all the universities listed in THEWUR with the information of the fieldofstudy they belong to.
- Important considerations about this dataset:
  - MAG can associate a paper to multiple fieldofstudyid. If a paper belongs to more than one of our fieldofstudyid, separate records were created for the paper with each of those _fieldofstudyid_s.
  - MAG assigns fieldofstudyid to every paper with a score. We preserve only those records whose score is more than 0.5 for any fieldofstudyid it belongs to.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Q3: What is the gender distribution in authorship of papers published by the universities?
- Dataset Name format: author_gender_countryname_papers.csv
- Dataset Contents: All papers with their author names for all the universities listed in THEWUR.
- Important considerations about this dataset:
  - When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
  - An external script was executed to determine the gender of the authors. The script is available here.

Q4: Distribution of staff seniority (= number of years from their first publication until the last publication) in the given university.
- Dataset Name format: author_ids_countryname_papers.csv
- Dataset Contents: For a given country, all papers for authors with their publication year for all the universities listed in THEWUR.
- Important considerations about this work:
  - When there are multiple collaborators (authors) for the same paper, this dataset makes sure that only the records for collaborators from within selected universities are preserved.
  - Calculating staff seniority can be achieved in various ways. The most straightforward option is to calculate it as _academic_age = MAX(year) - MIN(year)_ for each authorid.

Q5: Citation counts (incoming) for OA vs Non-OA papers published by the university.
- Dataset Name format: cc_oa_countryname_papers.csv
- Dataset Contents: OA status and OA links for all papers of all the universities listed in THEWUR and, for each of those papers, the count of incoming citations available in MAG.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to.
  - Only those records marked as true for the _is_OA_ field can be said to be OA. Others with false or no value for the is_OA field are of unknown status (i.e. not necessarily closed access).

Q6: Count of OA vs Non-OA references (outgoing) for all papers published by universities.
- Dataset Name format: rc_oa_countryname_-papers.csv
- Dataset Contents: Counts of all OA and unknown papers referenced by all papers published by all the universities listed in THEWUR.
- Important considerations about this dataset:
  - CORE Discovery was used to establish the OA status of papers being referenced.
  - Papers with multiple authorship are preserved only once towards each of the distinct institutions their authors may belong to. Papers with authorship from multiple universities are counted once towards each of the universities concerned.

Additional files:
- _fieldsofstudy_mag_.csv: this file contains a dump of the fieldsofstudy table of MAG, mapping each of the ids to its actual field of study name.
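Two of the considerations above lend themselves to a short worked example: the academic age formula from Q4 and the three-state reading of the is_OA field from Q1 and Q5. The sketch below uses plain Python on invented in-memory records; the function names and the representation of a missing is_OA value as `None` are assumptions, and reading the actual CSV files is left out because their exact column headers are not stated here.

```python
def academic_age(records):
    """Compute MAX(year) - MIN(year) per author.

    records: iterable of (authorid, year) pairs (hypothetical in-memory form).
    """
    first, last = {}, {}
    for author, year in records:
        first[author] = min(year, first.get(author, year))
        last[author] = max(year, last.get(author, year))
    return {author: last[author] - first[author] for author in first}


def confirmed_oa_share(is_oa_flags):
    """Share of papers confirmed as OA.

    Only True counts as OA; False and None (no value) are both treated as
    unknown status, so the result is a lower bound on the real OA share.
    """
    flags = list(is_oa_flags)
    if not flags:
        return 0.0
    confirmed = sum(1 for flag in flags if flag is True)
    return confirmed / len(flags)


# Invented example data.
ages = academic_age([("a1", 2005), ("a1", 2012), ("a2", 2010), ("a1", 2001)])
share = confirmed_oa_share([True, False, None, True])
print(ages, share)
```

An author with a single publication year gets an academic age of 0 under this formula, which is one reason other seniority definitions may be preferred.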
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides Census 2021 estimates that classify schoolchildren and full-time students aged 5 years and over in England and Wales by student accommodation and by age. The estimates are as at Census Day, 21 March 2021.
Estimates for single year of age between ages 90 and 100+ are less reliable than other ages. Estimation and adjustment at these ages was based on the age range 90+ rather than five-year age bands. Read more about this quality notice.
Area type
Census 2021 statistics are published for a number of different geographies. These can be large, for example the whole of England, or small, for example an output area (OA), the lowest level of geography for which statistics are produced.
For higher levels of geography, more detailed statistics can be produced. When a lower level of geography is used, such as output areas (which have a minimum of 100 persons), the statistics produced have less detail. This is to protect the confidentiality of people and ensure that individuals or their characteristics cannot be identified.
Coverage
Census 2021 statistics are published for the whole of England and Wales. Data are also available in these geographic types:
Student accommodation type
Combines the living situation of students and school children in full-time education, whether they are living:
It also includes whether these households contain one or multiple families.
This variable is comparable with the student accommodation variable but splits the communal establishment type into “university” and “other” categories.
Age
A person’s age on Census Day, 21 March 2021 in England and Wales. Infants aged under 1 year are classified as 0 years of age.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Open University (OU) dataset is an open database containing student demographic and click-stream interaction with the virtual learning platform. The available data are structured in different CSV files. You can find more information about the original dataset at the following link: https://analyse.kmi.open.ac.uk/open_dataset.
We extracted a subset of the original dataset that focuses on student information. 25,819 records were collected referring to a specific student, course and semester. Each record is described by the following 20 attributes: code_module, code_presentation, gender, highest_education, imd_band, age_band, num_of_prev_attempts, studies_credits, disability, resource, homepage, forum, glossary, outcontent, subpage, url, outcollaborate, quiz, AvgScore, count.
Two target classes were considered, namely Fail and Pass, obtained by merging the original four classes (Fail with Withdrawn, and Pass with Distinction). The final_result attribute contains the target values.
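As a minimal sketch of the class merge described above, the four original labels can be collapsed with a lookup table. The label strings follow the class names given in the text; the names `TARGET_MAP` and `to_binary_target` are hypothetical, and the numeric codes actually stored in the dataset are not reproduced here.

```python
# Merge of the four original outcome labels into two target classes.
TARGET_MAP = {
    "Fail": "Fail",
    "Withdrawn": "Fail",
    "Pass": "Pass",
    "Distinction": "Pass",
}


def to_binary_target(final_result):
    """Map an original four-class label to the two-class target."""
    try:
        return TARGET_MAP[final_result]
    except KeyError:
        raise ValueError(f"unexpected final_result: {final_result!r}") from None


original = ["Pass", "Withdrawn", "Distinction", "Fail"]
merged = [to_binary_target(label) for label in original]
print(merged)
```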
All features have been converted to numbers for automatic processing.
Below is the mapping used to convert categorical values to numeric:
For more detailed information, please refer to:
Casalino G., Castellano G., Vessio G. (2021) Exploiting Time in Adaptive Learning from Educational Data. In: Agrati L.S. et al. (eds) Bridges and Mediation in Higher Distance Education. HELMeTO 2020. Communications in Computer and Information Science, vol 1344. Springer, Cham. https://doi.org/10.1007/978-3-030-67435-9_1
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides Census 2022 estimates for distance travelled to place of study of people aged 4 and over studying by age (in 4 categories) in Scotland.
A person's age on Census Day, 20 March 2022. Infants aged under 1 year are classified as 0 years of age.
The distance between a person’s home address and their main place of work or study (Grouped).
Address of place of work or study is used (along with home address) to explore the relationship between where people live and where they work or study. Used in conjunction with information from the method of travel question, the data helps to identify commuter patterns and routes and provide a reliable indicator for the demands placed on public and private transport.
It is used to inform the balance of housing and jobs in particular areas and assess the need for services such as new schools. Information on where people live and work is used by government departments to define “Travel to Work Areas” - these are approximations of self-contained labour markets and are the smallest areas for which unemployment rates are published. Collecting information on both work and study address enables a more accurate count of daytime populations to be obtained, which is particularly useful for areas accommodating universities and businesses. It also allows the differences in travel patterns between these groups to be compared.
Details of classification can be found here
The quality assurance report can be found here
The aim of the research is to provide an empirically based understanding of the Net Generation as they enter university. The research uses a mixture of survey methods, interview and observation to achieve the following objectives: (1) To explore their attitudes, expectations and experience of e-learning at university; (2) To explore any linkages between their prior exposure to gaming and digital networked technology and their expressed attitudes towards and experience of e-learning; (3) To investigate the use of social software; (4) To develop the theoretical basis for understanding any generational changes; (5) To provide timely evidence-based advice for policy makers, teaching staff and administrators. This research will aim to explore students coming from the Net Generation as they first encounter e-learning at university. The Net Generation are distinct as they grew up with games and digital technologies. They are distinct in ways that have a relevance to teaching and learning, including questions related to attention span and information searching patterns. At the same time universities in the UK have been exploring a more extensive use of e-learning. The policy direction emphasizes learners’ needs and aspirations but we have little empirical evidence of the changing student population. The collection consists of electronic/paper surveys (3), telephone interviews (80 interviewees, 79 interviews), a cultural probe (involving 19 students) and focus groups (4). Combination of one-time (Survey 1) and repeated study (Surveys 2 and 3). The collection contains both qualitative and quantitative data. Quantitative Data: Number of survey databases: 3. Survey 1 Database: 256 variables; 596 cases. Survey 2 Database: 124 variables; 1099 cases. Survey 3 Database: 127 variables; 716 cases. 
Qualitative Data: Interview transcripts: 79 documents (transcripts from 3 interviews attached); interview questions: 3 documents; cultural probe: 19 documents containing transcripts of videos and notebook entries; focus group transcripts: 4 documents. The studied population were 1st year students at 5 English universities and their staff. Number of students taking courses surveyed: 2415. Number of students interviewed: 68. Number of staff interviewed: 12.
Datasets provided by the Open University, a UK-based online learning university. More about the dataset: https://analyse.kmi.open.ac.uk/open_dataset
For the academic year of 2024/2025, the University of Oxford was ranked as the best university in the world, with an overall score of 98.5 according to Times Higher Education. The Massachusetts Institute of Technology and Harvard University followed behind. A high number of the leading universities in the world are located in the United States, with ETH Zürich in Switzerland the highest-ranked institution located in neither the United Kingdom nor the U.S.
1965 Coastal Land Use Data. Created from a physical survey carried out by the University of Reading. Project details: https://www.nationaltrust.org.uk/documents/mapping-our-shores-fifty-years-of-land-use-change-at-the-coast.pdf In 1965, concerned about the impact of development along the coast, the National Trust launched ‘Enterprise Neptune’ to help raise money to buy and protect the most ‘pristine’ stretches. In order to understand which areas were most at risk from development, University of Reading staff & students were commissioned to carry out a physical coastal land use survey that was lovingly recorded on 350 OS maps at a scale of 2.5 inches to 1 mile. Half a century later, the Neptune Coastline Campaign has raised £65 million, enabling the National Trust to acquire an additional 550 miles of coastline, bringing the total to 775 miles. To celebrate this milestone the Trust commissioned the University of Leicester to re-survey the land use along the coast with a desktop methodology that focused on change (2014 Coastal Land Use dataset). For more information on the creation of the Land Use datasets see: http://onlinelibrary.wiley.com/doi/10.1111/tran.12128/abstract
https://creativecommons.org/publicdomain/zero/1.0/
I recently submitted my dissertation for my MSc in Business Analytics titled: Understanding & Predicting Student Rental Prices in a U.K. city: Machine Learning & Traditional Methods.
I chose this dissertation research area due to the lack of literature investigating U.K. rental dynamics (particularly in Northern Ireland) and due to the real and very current issue of rising rent felt in Belfast by students.
Based on a selection of 36 property variables such as geographic location, bedroom number & property size - I built multiple machine learning models to predict the price of rent and to understand the most important variables in selected models.
No existing dataset was available that combined all the required information for Belfast and therefore I chose to complete the task of data mining and cleaning the information, pulling it all into one dataset. I sourced the info from Property Pal and Property News. Please check the dataset as there may be minor repetition or some columns which should not be used.
Finally, I leveraged these findings into an interactive dashboard (you can view a link below) which enables students to view all available properties and determine which one has the required features alongside appropriate pricing.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The RESPOND project produced a large amount of empirical material in 11 countries (Sweden, the UK, Germany, Italy, Poland, Austria, Greece, Bulgaria, Turkey, Iraq, and Lebanon), where the research was conducted between 2017 and 2020. The country teams gathered macro (policies), meso (implementation/stakeholders) and micro (individuals/asylum seekers and refugees) level data related to the thematic fields formulated in four work packages: borders, protection regimes, reception, and integration. An important contribution of this research has been its micro/individual focus, which enabled the research teams to capture and understand the migration experiences of asylum seekers and refugees and their responses to the policies and obstacles that they have encountered.
Country teams conducted in total 539 interviews with refugees and asylum seekers, and more than 210 interviews with stakeholders (state and non-state actors) working in the field of migration. Additionally, the project has conducted a survey study in Sweden and Turkey (n=700 in each country), covering similar topics.
This dataset is only about the micro part of the Respond research, and reflects data derived out of 539 interviews conducted with asylum seekers and refugees in 11 countries and here presented in a quantitative form. The whole dataset is structured along the work package topics: Border, Protection, Reception and Integration.
This dataset was prepared as part of Work Package D4.4 (Dataset on Reception) of the Horizon 2020 RESPOND project as a joint effort of the project partners listed below.
https://www.insight.hdrhub.org/
Background
Glaucoma is a leading cause of irreversible sight loss worldwide. An estimated 60 million people have glaucoma. Glaucoma is a condition of increased intraocular pressure in the eye. Because it may be asymptomatic until a relatively late stage, diagnosis is frequently delayed. There are four general categories of glaucoma: primary open-angle and angle-closure, and secondary open and angle-closure glaucoma.
The UHB glaucoma dataset is a longitudinal dataset consisting of routinely collected clinical metadata from patients receiving treatment for glaucoma at UHB, from 2007 to the present.
This dataset encompasses all patients at UHB who have received a diagnosis of primary or secondary glaucoma or ocular hypertension. Clinical metadata includes demographic information, visual acuities, central corneal thickness, intraocular pressure, optic nerve head findings, and mean deviation of the Humphrey visual fields.
This dataset is continuously updated; as of 1st October 2021, it consisted of 5065 people. This is a large single-centre database of patients with glaucoma and covers more than a decade of follow-up for these patients.
Geography
The Queen Elizabeth Hospital is one of the largest single-site hospitals in the United Kingdom, with 1,215 inpatient beds. Queen Elizabeth Hospital is part of one of the largest teaching trusts in England (University Hospitals Birmingham). It is set within the West Midlands and has a catchment population of circa 5.9 million. The region includes a diverse ethnic and socio-economic mix, with a higher than UK average proportion of minority ethnic groups. It has a large number of elderly residents but is the youngest population in the UK. There are particularly high rates of diabetes, physical inactivity, obesity, and smoking.
Data source: Ophthalmology department at Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom.
Reporting lower environmental impacts and higher growth potential compared to traditional inshore farms, offshore mussel farming has the potential to become one of the most sustainable, large-scale sources of healthy protein. By annually monitoring the UK’s first offshore, long-line mussel farm since it was first developed in 2013 in Lyme Bay, UK, the University of Plymouth has used ecological and oceanographic techniques to evidence how the farm has delivered increases in pelagic, epi-benthic and infaunal biodiversity. The Ropes to Reef project will further assess the ecosystem services and benefits of offshore mussel farming and assess the restoration of essential fish habitat (EFH), biodiversity and associated healthy fish stocks (biomass). The project will also aim to quantify the connectivity of these ecosystem services with the adjacent MPA and the spillover effect to fishing grounds. This project’s methodology is based on a multi-trophic level approach combining ecological and oceanographic techniques. The project will use non-destructive remote sampling techniques such as an echosounder, multibeam and ground truthing cameras deployed from local fishing boats, to produce high resolution data on the biodiversity and extent of essential fish habitat and associated mobile species. Fishes and crustaceans will also be tracked using acoustic tags via the world’s first multi-farm (mussel, scallop, and seaweed) aquaculture telemetry network. This dataset relates to the acoustic telemetry aspect of the study.
https://spdx.org/licenses/CC0-1.0.html
We are publishing a walking activity dataset including inertial and positioning information from 19 volunteers, including reference distance measured using a trundle wheel. The dataset includes a total of 96.7 km walked by the volunteers, split into 203 separate tracks. The trundle wheel is of two types: it is either an analogue trundle wheel, which provides the total number of meters walked in a single track, or it is a sensorized trundle wheel, which measures every revolution of the wheel, therefore recording a continuous incremental distance.
Each track has data from the accelerometer and gyroscope embedded in the phones, location information from the Global Navigation Satellite System (GNSS), and the step count obtained by the device. The dataset can be used to implement walking distance estimation algorithms and to explore data quality in the context of walking activity and physical capacity tests, fitness, and pedestrian navigation.
Methods
The proposed dataset is a collection of walks where participants used their own smartphones to capture inertial and positioning information. The participants involved in the data collection come from two sites. The first site is the Oxford University Hospitals NHS Foundation Trust, United Kingdom, where 10 participants (7 affected by cardiovascular diseases and 3 healthy individuals) performed unsupervised 6MWTs in an outdoor environment of their choice (ethical approval obtained by the UK National Health Service Health Research Authority protocol reference numbers: 17/WM/0355). All participants involved provided informed consent. The second site is at Malmö University, in Sweden, where a group of 9 healthy researchers collected data. This dataset can be used by researchers to develop distance estimation algorithms and to study how data quality impacts the estimation.
All walks were performed while holding a smartphone in one hand, with an app collecting inertial data, the GNSS signal, and the step count. In the other hand, participants held a trundle wheel to obtain the ground truth distance. Two different trundle wheels were used: an analogue trundle wheel that registered a single total value of walked distance, and a sensorized trundle wheel that collected a timestamp and distance at every 1-meter revolution, resulting in continuous incremental distance information. The latter configuration is innovative and allows temporal windows of the IMU data to be used as input to machine learning algorithms that estimate walked distance. For data collected by researchers, if several walks were done simultaneously and close to each other, only one person used the trundle wheel, and the reference distance was associated with all walks collected at the same time.
The walked paths are of variable length, duration, and shape. Participants were instructed to walk paths of increasing curvature, from straight to rounded. Irregular paths are particularly useful in determining limitations in the accuracy of walked distance algorithms.
Two smartphone applications were developed for collecting the information of interest from the participants' devices, both available for Android and iOS operating systems. The first is a web application that retrieves inertial data (acceleration, rotation rate, orientation) while connecting to the sensorized trundle wheel to record incremental reference distance [1]. The second is the Timed Walk app [2], which guides the user in performing a walking test by signalling when to start and when to stop the walk while collecting both inertial and positioning data. All participants in the UK used the Timed Walk app.
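As a rough illustration of how the per-revolution timestamps of the sensorized trundle wheel could label temporal windows of IMU data, the sketch below counts wheel ticks inside fixed-length windows. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `label_windows`, the 5-second window, and timestamps expressed in seconds are all hypothetical choices. Only the 1-meter-per-revolution spacing comes from the description above.

```python
from bisect import bisect_right


def label_windows(imu_t, wheel_t, window_s=5.0, meters_per_tick=1.0):
    """Split a recording into fixed windows and label each with reference distance.

    imu_t   -- sorted IMU sample timestamps in seconds (assumed unit)
    wheel_t -- sorted timestamps of trundle-wheel revolutions in seconds
    Returns a list of (window_start, window_stop, distance_in_meters).
    """
    if not imu_t:
        return []
    windows = []
    start, end = imu_t[0], imu_t[-1]
    # Only complete windows are kept; a trailing partial window is dropped.
    while start + window_s <= end:
        stop = start + window_s
        # Count wheel ticks with start < t <= stop.
        ticks = bisect_right(wheel_t, stop) - bisect_right(wheel_t, start)
        windows.append((start, stop, ticks * meters_per_tick))
        start = stop
    return windows


if __name__ == "__main__":
    imu_times = list(range(0, 11))            # toy 1 Hz stream, 0..10 s
    wheel_ticks = [1, 2, 3, 4, 5, 6.5, 8, 9.5]  # one tick per meter walked
    for window in label_windows(imu_times, wheel_ticks):
        print(window)
```

Each returned tuple can be paired with the IMU samples falling in the same interval to form one training example. With the analogue trundle wheel only a single total per track is available, so this kind of per-window labelling would not apply.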
The data collected during the walk come from the Inertial Measurement Unit (IMU) of the phone and, when available, the Global Navigation Satellite System (GNSS). In addition, the step count is retrieved from the sensors embedded in each participant’s smartphone. With the dataset, we provide a descriptive table with the characteristics of each recording, including the brand and model of the smartphone, duration, reference total distance, and types of signals included, together with scores for some relevant parameters related to the quality of the various signals.
The path curvature is one of the most relevant parameters. Previous work from our team showed that curved paths negatively affect multiple distance estimation algorithms [3]. We visually inspected the walked paths and clustered them in three groups: a) straight path, i.e. no turns wider than 90 degrees; b) gently curved path, i.e. between one and five turns wider than 90 degrees; and c) curved path, i.e. more than five turns wider than 90 degrees.
Other features relevant to the quality of the collected signals are: the total time during which inertial and GNSS data were missing for longer than a threshold (0.05 s and 6 s, respectively), whether due to technical issues or to the app going into the background and losing access to the sensors; the sampling frequency of the different data streams; the average walking speed; and the smartphone position. The start of each walk is set to 0 ms, so no absolute time information is reported.
Walk locations collected in the UK are anonymized using the following approach: the first position is fixed to a central location in the city of Oxford (latitude: 51.7520, longitude: -1.2577) and all other positions are reassigned by applying a translation along the longitudinal and latitudinal axes which maintains the original distance and angle between samples. This way, the exact geographical location is lost, but the path shape and the distances between samples are maintained. The straight-line (“as the crow flies”) distance between consecutive points and the path curvature were inspected numerically and visually to confirm that they match those of the original walks. Computations were made using the Haversine Python library.
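The anonymization idea can be sketched as follows. This is one possible reading of the description, not the authors' code: it rebuilds the track from the fixed Oxford anchor by reusing the distance and initial bearing of every original step, and it implements the haversine and destination-point formulas directly with the standard library on a spherical Earth rather than calling the Haversine package. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical model (assumption)


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_rad(p, q):
    """Initial bearing from p to q, in radians clockwise from north."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.atan2(y, x)


def destination(p, bearing, dist_m):
    """Point reached from p after travelling dist_m along the given bearing."""
    lat1, lon1 = map(math.radians, p)
    d = dist_m / EARTH_RADIUS_M
    lat2 = math.asin(math.sin(lat1) * math.cos(d)
                     + math.cos(lat1) * math.sin(d) * math.cos(bearing))
    lon2 = lon1 + math.atan2(math.sin(bearing) * math.sin(d) * math.cos(lat1),
                             math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return (math.degrees(lat2), math.degrees(lon2))


def anonymise(track, anchor=(51.7520, -1.2577)):
    """Re-anchor a track at `anchor`, keeping step distances and bearings."""
    if not track:
        return []
    out = [anchor]
    for prev, cur in zip(track, track[1:]):
        out.append(destination(out[-1], bearing_rad(prev, cur),
                               haversine_m(prev, cur)))
    return out


if __name__ == "__main__":
    original = [(55.6000, 13.0000), (55.6005, 13.0000), (55.6005, 13.0010)]
    for point in anonymise(original):
        print(point)
```

Comparing `haversine_m` over consecutive pairs before and after the transformation is a simple numerical check that the step distances are preserved, in the spirit of the inspection described above.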
Multiple datasets are available regarding walking activity recognition among other daily living tasks. However, few studies are published with datasets that focus on distance for both indoor and outdoor environments and that provide relevant ground truth information for it. Yan et al. [4] introduced an inertial walking dataset within indoor scenarios using a smartphone placed in 4 positions (on the leg, in a bag, in the hand, and on the body) by six healthy participants. The reference measurement used in this study is a Visual Odometry System embedded in a smartphone that has to be worn at chest level, held in place with a strap. While interesting and detailed, this dataset lacks GNSS data, which is likely to be used in outdoor scenarios, and the reference used for localization also suffers from accuracy issues, especially outdoors. Vezovcnik et al. [5] analysed estimation models for step length and provided an open-source dataset with a total of 22 km of inertial-only walking data from 15 healthy adults. While relevant, their dataset focuses on steps rather than total distance and was acquired on a treadmill, which limits its validity in real-world scenarios. Kang et al. [6] proposed a way to estimate travelled distance using an Android app that learns each participant's outdoor walking patterns and matches them in indoor contexts. They collect data outdoors, including both inertial and positioning information, and use average speed values obtained from the GPS data as reference labels. They then use deep learning models to estimate walked distance, obtaining high performance. They report that 3% to 11% of the data for each participant was discarded due to low quality. Unfortunately, the name of the app is not reported and the paper does not mention whether the dataset can be made available.
This dataset is heterogeneous in several respects. Most participants are healthy, so outcomes from this dataset cannot be generalized to all walking styles or physical conditions. The dataset is also heterogeneous from a technical perspective, given the differences in devices, acquired data, and smartphone apps used (e.g. some tests lack IMU or GNSS data, and the sampling frequency on iPhones was particularly low). We suggest selecting appropriate tracks based on the desired characteristics to obtain reliable and consistent outcomes.
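A track selection of this kind could look like the sketch below. It is illustrative only: the column names `turns_over_90` and `missing_imu_s` are hypothetical stand-ins for whatever the descriptive table actually uses, and summing the full length of each over-threshold gap is an assumption about how missing time is counted. The three curvature groups and the 0.05 s and 6 s thresholds are taken from the description of the dataset.

```python
def curvature_class(turns_over_90):
    """Map the number of turns wider than 90 degrees to the three path groups."""
    if turns_over_90 == 0:
        return "straight"
    if turns_over_90 <= 5:
        return "gently curved"
    return "curved"


def missing_time(timestamps_s, threshold_s):
    """Total duration of gaps between consecutive samples longer than threshold_s.

    Use threshold_s=0.05 for inertial data and threshold_s=6 for GNSS data.
    """
    gaps = (b - a for a, b in zip(timestamps_s, timestamps_s[1:]))
    return sum(gap for gap in gaps if gap > threshold_s)


def select_tracks(rows, wanted_class, max_missing_imu_s):
    """Keep rows of the wanted curvature class with limited missing IMU time."""
    return [
        row for row in rows
        if curvature_class(row["turns_over_90"]) == wanted_class
        and row["missing_imu_s"] <= max_missing_imu_s
    ]


if __name__ == "__main__":
    table = [  # toy rows, not real dataset values
        {"track": "A", "turns_over_90": 0, "missing_imu_s": 0.0},
        {"track": "B", "turns_over_90": 3, "missing_imu_s": 1.2},
        {"track": "C", "turns_over_90": 0, "missing_imu_s": 9.5},
    ]
    print(select_tracks(table, "straight", max_missing_imu_s=1.0))
```

Filtering on one curvature group and a cap on missing sensor time keeps comparisons between algorithms on like-for-like tracks.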
This dataset allows researchers to develop algorithms to compute walked distance and to explore data quality and reliability in the context of walking activity. The dataset was initiated to investigate the digitalization of the 6MWT; however, the collected information can also be useful for other physical capacity tests that involve walking (distance- or duration-based), or for other purposes such as fitness and pedestrian navigation.
The article related to this dataset will be published in the proceedings of the IEEE MetroXRAINE 2024 conference, held in St. Albans, UK, 21-23 October.
This research is partially funded by the Swedish Knowledge Foundation and the Internet of Things and People research center through the Synergy project Intelligent and Trustworthy IoT Systems.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The nature of designing, as well as the professional characteristics of many designers, leaves them vulnerable to the delay of tasks and decisions, also known as procrastination. Procrastination is not discussed in design literature. It is defined as the voluntary delay of, or inability to complete, a task or decision, and it is often linked to the individual being overwhelmed. The submitted dataset comes from a questionnaire completed by 155 design students and staff within a UK design and creative arts school. It asked about the frequency and form of procrastination, and about influences on respondents' behaviour when undertaking stages of a design process. The stages included: literature review, ideation, prototyping, and report writing. The outcomes suggested chronic procrastination related to all stages of a design process, with a frequency of more than once a week. Additional questions highlighted that multiple tasks were likely to overwhelm the respondents, whilst distractions such as new projects were likely to result in completing alternative tasks. An additional open question provided qualifying comments suggesting procrastination wasn’t explicitly addressed in academic design training. Two key activities to reduce the effects of procrastination were suggested: 1) prioritise tasks; and 2) reduce the complexity of each task. Additional advice included: development of professional self-confidence, realistic goal planning, minimising external stimulus, controlling workflows, working in study groups, developing virtuous routines at optimal times during the day, the management of reward and consequence, and use of technology to optimise self-regulation.