Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of population level and growth rate for the Dallas-Fort Worth metro area from 1950 to 2025.
https://creativecommons.org/publicdomain/zero/1.0/
ALL FILES ARE LOCATED AT MY REPOSITORY: https://github.com/christianio123/TexasAttendance
I was curious about factors affecting school attendance, so I gathered data from school districts around Texas to get a better idea.
The purpose of the project is to help determine factors associated with student attendance in the state of Texas. No particular population is targeted as an audience for the project. However, anyone associated with education may find the dataset used (and other data obtained but not used) helpful for questions regarding student attendance in Texas during the first two months of the 2020-2021 academic school year. This topic was chosen specifically because of the abnormalities in the current academic school year.
The majority of the data in this project came from school districts around the state of Texas, public census information, and public COVID-19 data. To obtain student attendance information, an email was sent to 40 school districts around the state of Texas on November 2nd, 2020, under the Freedom of Information Act (FOIA). Of those districts, 19 responded with the requested data, while other districts required purchase of the data because of the labor hours involved. Due to ambiguity in the original message sent to districts, the types of data received varied. The major difference was between “daily” records of student attendance and a “summary” of student attendance records so far this academic school year. School districts took between 10 and 15 business days to respond, not including holidays. The focus of this project is “daily student attendance”, in order to find relationships with, or influences from, external or internal factors on any given school day. Therefore, of the 19 school districts that responded, 11 sent the appropriate data.
The 11 school districts that sent data were (1) Conroe ISD, (2) Cypress-Fairbanks ISD, (3) Floydada ISD, (4) Fort Worth ISD, (5) Pasadena ISD, (6) Snook ISD, (7) Socorro ISD, (8) Klein ISD, (9) Garland ISD, (10) Dallas ISD, and (11) Katy ISD. However, even within these datasets there were discrepancies: three school districts sent daily attendance data that included student grade level, but one school district did not include any other information. Also, of the 11 school districts, nine included student attendance broken down by school, while three other school districts had only student attendance with no other attributes. This information is important for explaining certain steps in analysis preparation later. Variables used from the school district datasets included (a) dates, (b) weekdays, (c) school name, (d) school type, (e) district, and (f) grade level.
In addition to daily student attendance data, two other datasets from the Texas Education Agency were used, with data about each school and school district. The first dataset, “Current Schools”, gives information about each school in the state of Texas as of May 2020, such as address, principal, county name, district number, and much more. From this dataset, the variables selected were (a) school name, (b) school zip, (c) district number, and (d) school type. The second dataset, “District Type”, gives attributes of each school district, such as whether the district is considered major urban, independent town, or rural. From the “District Type” dataset, the variables selected were (a) district, (b) district number, (c) Texas Education Agency (TEA) description, and (d) National Center for Education Statistics (NCES) description. To determine whether a county is metropolitan or non-metropolitan, a dataset from Texas Health and Human Services was used. The variables selected from this dataset were (a) county name and (b) metro area.
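As a minimal sketch of how these selected variables could be combined, the following attaches school-level and district-level attributes to daily attendance rows. Every field name and sample value here (including `present`, `enrolled`, and the placeholder zip code) is a hypothetical stand-in, not the actual column naming of the TEA files or of any district's data.

```python
# Hypothetical sketch: left-join attendance rows to school and district attributes.
# All field names and values are illustrative placeholders.

attendance = [
    {"date": "2020-09-08", "district_number": "001",
     "school_name": "Example Elementary", "present": 450, "enrolled": 500},
    {"date": "2020-09-08", "district_number": "002",
     "school_name": "Sample High", "present": 900, "enrolled": 1000},
]

# Stands in for the "Current Schools" dataset.
schools = [
    {"district_number": "001", "school_name": "Example Elementary",
     "school_zip": "00000", "school_type": "Elementary"},
]

# Stands in for the "District Type" dataset: district number -> TEA description.
district_types = {"001": "Major Urban"}


def school_key(district_number, school_name):
    """Normalize the join key so casing and stray spaces do not block a match."""
    return (district_number.strip(), school_name.strip().lower())


def enrich(attendance_rows, school_rows, tea_by_district):
    """Return attendance rows with school and district attributes attached.

    Rows without a match keep None for the missing attributes (a left join),
    so unmatched schools stay visible instead of silently disappearing.
    """
    index = {school_key(s["district_number"], s["school_name"]): s
             for s in school_rows}
    enriched = []
    for row in attendance_rows:
        school = index.get(school_key(row["district_number"], row["school_name"]), {})
        enriched.append({
            **row,
            "school_zip": school.get("school_zip"),
            "school_type": school.get("school_type"),
            "tea_description": tea_by_district.get(row["district_number"]),
        })
    return enriched


enriched_rows = enrich(attendance, schools, district_types)
```

Keeping unmatched rows with `None` values makes it easier to spot districts that did not report attendance by school, which the description notes was the case for some of them.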
Student attendance has been noticeably different this academic school year, so live COVID-19 data was obtained from the New York Times to examine any relationship. This dataset is updated daily and is available at three levels (country, state, and county). From this dataset, the variables selected were COVID-19 cases by state and by county.
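A rough sketch of how the county-level case counts could be prepared for matching against school days is shown below. It assumes the county file uses the column layout `date,county,state,fips,cases,deaths` with cumulative counts; the sample rows and their numbers are made up for illustration, and the function names are hypothetical.

```python
import csv
import io

# Made-up sample rows in the assumed county-file layout (counts are invented).
SAMPLE = """date,county,state,fips,cases,deaths
2020-09-01,Dallas,Texas,48113,100,1
2020-09-02,Dallas,Texas,48113,130,1
2020-09-01,Harris,Texas,48201,200,2
2020-09-01,Orleans,Louisiana,22071,50,0
"""


def cases_by_date_and_county(csv_text, state="Texas"):
    """Map (date, county) -> cumulative cases for one state."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["state"] == state:
            lookup[(row["date"], row["county"])] = int(row["cases"])
    return lookup


def daily_new_cases(cumulative):
    """Turn cumulative counts into day-over-day increases per county.

    The first observed day for a county has no earlier value to subtract,
    so it is recorded as None rather than guessed.
    """
    new_cases = {}
    previous = {}
    # ISO dates sort chronologically, so sorting the keys orders each county by day.
    for date, county in sorted(cumulative):
        current = cumulative[(date, county)]
        prior = previous.get(county)
        new_cases[(date, county)] = None if prior is None else current - prior
        previous[county] = current
    return new_cases


texas_cumulative = cases_by_date_and_county(SAMPLE)
texas_daily = daily_new_cases(texas_cumulative)
```

A lookup keyed by `(date, county)` lines up directly with a daily attendance row once the school's county is known.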
Each school has a unique student population, so census data from 2018 (the best available estimate of the current population) was used to describe the makeup of the population surrounding a school by zip code. From the census data, the variables selected were zip code, race/ethnicity, median income, unemployment rate, and education. These variables were selected to determine differences in school attendance based on the makeup of the population surrounding the school.
Weather seems to have an impact on student attendance at schools, so weather data has been included based on county measures.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides a comprehensive overview of academic performance, student engagement, and sentiment trends collected from educational institutions in the Dallas-Fort Worth metropolitan area, including institutions such as Southern Methodist University, University of Texas at Arlington, Dallas College, and Texas Christian University. Covering the period from January 2018 to December 2024, the dataset includes hourly records that capture students' experiences across various programs and disciplines. Data sources include institutional records, learning management systems, and student feedback platforms, providing insights into academic achievements, student behaviors, and support services utilization.
The dataset aims to facilitate in-depth analysis of performance trends, engagement patterns, and the effectiveness of academic support services, making it suitable for educational data mining and predictive modeling. Prior to analysis, the data underwent preprocessing steps to handle missing values, mitigate outliers, and adjust for institutional variations, ensuring reliability and consistency.
Dataset Features Overview
S.No | Features | Description
1 | Timestamp | The date and time when the data was recorded, on an hourly basis.
2 | Student_ID | A unique identifier assigned to each student in the dataset.
3 | Age | The age of the student at the time of data collection.
4 | Gender | The gender of the student (encoded as binary or categorical values).
5 | Ethnicity | The ethnic background of the student, based on available demographic data.
6 | SES | Socioeconomic status indicator reflecting the student's background.
7 | Location | The geographic location where the data was collected, specifically in the Dallas-Fort Worth area.
8 | Enrollment_Status | Status indicating whether the student is enrolled full-time or part-time.
9 | GPA | Grade Point Average, representing the student's academic performance.
10 | Attendance_Rate | The rate at which the student attends classes, expressed as a percentage.
11 | Study_Hours_per_Week | The number of hours the student spends studying each week.
12 | Extracurricular_Participation | A score indicating the level of participation in extracurricular activities.
13 | Course_Load | The number of courses a student is taking during a given period.
14 | Previous_Academic_Performance | A historical indicator of the student's academic performance.
15 | Course_Type | The type of course (e.g., lecture, lab, seminar).
16 | Instructor_Rating | A rating reflecting the student's satisfaction with the instructor's teaching.
17 | Learning_Style_Compatibility | A score indicating how well the student's preferred learning style aligns with the course's format.
18 | Career_Alignment_Indicator | Measures the alignment between the course content and the student's career goals.
19 | Library_Usage_Frequency | The frequency with which the student accesses the library or online learning resources.
20 | Study_Group_Participation | Participation in study groups or collaborative learning activities.
21 | Resource_Access_Score | An indicator of the student's access to academic resources.
22 | Peer_Interaction_Score | A measure of the student's interaction with peers.
23 | Stress_Indicator_Score | A score reflecting the student's reported stress level.
... | ... | ...
43 | Learning_Satisfaction_Level | Indicator of the student's satisfaction with their learning experience.

This dataset allows researchers and analysts to explore key factors affecting academic success, engagement, and satisfaction in a real-world educational environment.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Accounts-Payable Time Series for Equity Residential. Equity Residential is committed to creating communities where people thrive. The Company, a member of the S&P 500, owns and manages 318 rental properties consisting of 86,320 apartment units in dynamic metro areas across the U.S. with a primary concentration in major coastal markets, diversified by a targeted presence in the high-growth metro areas of Atlanta, Austin, Dallas/Ft. Worth and Denver.
The data asset is relational and consists of four data files. The first contains customer information, the second address information, the third demographic data, and the fourth customer cancellation information. All of the data sets have linking ids, either ADDRESS_ID or CUSTOMER_ID. The ADDRESS_ID is specific to a postal service address. The CUSTOMER_ID is unique to a particular individual. Note that multiple customers can be assigned to the same address, and that not all customers have a match in the demographic table. The latitude-longitude information generally refers to the Dallas-Fort Worth Metroplex in North Texas and is mappable at a high level. Just be aware that if you drill down too far, some people may appear to live in the middle of Jerry World, DFW Airport, or Lake Grapevine. Any lat/long pointing to a specific residence, business, or physical site is coincidental. The physical addresses are fake and are unrelated to the lat/long.
In the termination table, you can derive a binary label (churn/did not churn) from the ACCT_SUSPD_DATE field. The data set is modelable. That is, you can use the other data in the data set to predict who did and did not churn. The underlying logic behind the prediction should be consistent with predicting auto insurance churn in the real world.
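One way the churn label might be derived is sketched below. It assumes the termination table links to customers through CUSTOMER_ID and that a non-empty ACCT_SUSPD_DATE means the customer churned; neither assumption is confirmed by the description. Apart from CUSTOMER_ID, ADDRESS_ID, and ACCT_SUSPD_DATE, every column name and value is a hypothetical placeholder.

```python
# Hypothetical rows. Only CUSTOMER_ID, ADDRESS_ID and ACCT_SUSPD_DATE are
# names taken from the dataset description; everything else is invented.
customers = [
    {"CUSTOMER_ID": "C1", "ADDRESS_ID": "A1"},
    {"CUSTOMER_ID": "C2", "ADDRESS_ID": "A1"},  # two customers share one address
    {"CUSTOMER_ID": "C3", "ADDRESS_ID": "A2"},
]
terminations = [
    {"CUSTOMER_ID": "C2", "ACCT_SUSPD_DATE": "2022-03-15"},
]
demographics = [
    {"CUSTOMER_ID": "C1", "segment": "placeholder"},  # hypothetical column
]


def label_churn(customer_rows, termination_rows, demographic_rows):
    """Attach a binary churn label and a demographics-match flag per customer.

    Assumption: a customer churned if the termination table holds a row for
    them with a non-empty ACCT_SUSPD_DATE.
    """
    suspended = {t["CUSTOMER_ID"] for t in termination_rows
                 if t.get("ACCT_SUSPD_DATE")}
    with_demographics = {d["CUSTOMER_ID"] for d in demographic_rows}
    labeled = []
    for customer in customer_rows:
        customer_id = customer["CUSTOMER_ID"]
        labeled.append({
            **customer,
            "churned": 1 if customer_id in suspended else 0,
            # Not every customer has a demographic match, so record it explicitly.
            "has_demographics": customer_id in with_demographics,
        })
    return labeled


labeled_customers = label_churn(customers, terminations, demographics)
```

Recording `has_demographics` as its own flag keeps customers without a demographic match in the modeling table rather than dropping them during the join.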
Terms and Conditions
Unless otherwise stated, the data on this site is free. It can be duplicated and used as you wish, but we'd appreciate it if you cite us as the source.