Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The mean shift algorithm estimates the local modes of a density using an idea based on iterative gradient ascent. In this paper we develop a mean-shift-inspired algorithm to estimate the modes of regression functions and partition the sample points in the input space. We prove convergence of the sequences generated by the algorithm and derive the non-asymptotic rates of convergence of the estimated local modes for the underlying regression model.
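For context, the classical density mean shift update that inspires this approach can be sketched as follows. This is a minimal NumPy illustration with a Gaussian kernel; the bandwidth `h` is a user-chosen smoothing parameter, and the sketch shows the standard density-mode algorithm rather than the regression-mode variant developed in the paper.

```python
import numpy as np

def mean_shift_step(x, X, h):
    """One Gaussian-kernel mean shift update: move x toward the
    kernel-weighted average of the sample points X (n x d)."""
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))  # kernel weights
    return w @ X / w.sum()

def find_mode(x0, X, h, tol=1e-6, max_iter=500):
    """Iterate mean shift steps from x0 until the shift is negligible."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, X, h)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x
```

Starting the iteration from every sample point and grouping points that converge to the same limit is the usual way such an algorithm partitions the input space.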
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MAPE and PB statistics for IBFI compared with other imputation methods (mean, median, mode, PMM, and Hotdeck) for 20% missingness of type MAR and all parameters tested (RN, TH, TC, RH, and PR).
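For reference, a minimal sketch of how these two error statistics are commonly computed is given below; PB is taken here to mean percentage bias, and the exact definitions used in the source may differ. Observed values are assumed to be nonzero.

```python
import numpy as np

def mape(observed, imputed):
    """Mean absolute percentage error between observed and imputed values."""
    observed, imputed = np.asarray(observed, float), np.asarray(imputed, float)
    return 100.0 * np.mean(np.abs((observed - imputed) / observed))

def percent_bias(observed, imputed):
    """Percentage bias: signed relative deviation of imputed totals from observed totals."""
    observed, imputed = np.asarray(observed, float), np.asarray(imputed, float)
    return 100.0 * np.sum(imputed - observed) / np.sum(observed)
```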
We propose a unique addition to the HerMES family of maps, the HerMES Large Mode Survey (HeLMS), which will be observed as part of the remaining SPIRE GT time. The field covers 270 deg^2 with two repeats over roughly 106 hours. It is chosen to overlap the SDSS Stripe 82 in a region where the Galactic cirrus, with a mean flux density of 1.2 MJy sr^-1, is at its lowest. The ancillary coverage in this field is extensive, particularly in the optical, making it an attractive large field. The scientific advantages of having a field with the size and makeup of HeLMS are twofold: first, the large survey area, of which 55 deg^2 overlaps Stripe 82, will allow us to resolve approximately 250,000 sources, of which 250-500 will be gravitationally lensed. Second, the large, uniform and cross-linked area will allow for unprecedented measurement of large-scale modes (down to l ~ 150). Such fidelity would be a unique asset in the submillimeter regime for the foreseeable future, and would enable cross-frequency correlation analysis with 80 deg^2 of deep ACT maps, Planck over the full 270 deg^2, as well as future observations with upcoming instruments like ALMA, ACTpol, and others.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical mirroring is the measure of the proximity or deviation of transformed data points from a specified location estimate within a given distribution [2]. Within the framework of Kabirian-based optinalysis [1], statistical mirroring is conceptualized as the isoreflectivity of the transformed data points to a defined statistical mirror. This statistical mirror is an amplified location estimate of the distribution, achieved through a specified size or length. The location estimate may include parameters such as the mean, median, mode, maximum, minimum, or reference value [2]. The process of statistical mirroring comprises two distinct phases: a) Preprocessing phase [2]: This involves applying preprocessing transformations, such as compulsory theoretical ordering, with or without centering the data. It also encompasses tasks like statistical mirror design and optimizations within the established optinalytic construction. These optimizations include selecting an efficient pairing style, central normalization, and establishing an isoreflective pair between the preprocessed data and its designed statistical mirror. b) Optinalytic model calculation phase [1]: This phase is focused on computing estimates based on Kabirian-based isomorphic optinalysis models.
References:
[1] K.B. Abdullahi, Kabirian-based optinalysis: A conceptually grounded framework for symmetry/asymmetry, similarity/dissimilarity, and identity/unidentity estimations in mathematical structures and biological sequences, MethodsX 11 (2023) 102400. https://doi.org/10.1016/j.mex.2023.102400
[2] K.B. Abdullahi, Statistical mirroring: A robust method for statistical dispersion estimation, MethodsX 12 (2024) 102682. https://doi.org/10.1016/j.mex.2024.102682
The sea-ice thickness distribution (ITD) is fundamental to understanding the heat, mass and freshwater budgets of the Arctic Ocean, plays important roles in climate feedbacks, and exerts strong control on regional primary productivity, marine navigation and surface operations. Estimates of the ITD based on observations in the perennial ice zone (PIZ) nearly always exhibit a prominent mode at some thickness, h_m, larger than and distinct from modal thicknesses associated with the annual freezeup and recently-opened leads. This mode has been interpreted as un-deformed multi-year ice at, or growing thermodynamically towards, its equilibrium thickness. Observations indicate significant inter-regional, annual and inter-annual variations of h_m within the PIZ, over the approximate range 1 to 5 meters. This statistic of the ITD has received much less attention than other parameters, such as the mean, standard deviation and e-folding length scale of ridged ice. The results of this project will improve understanding of the variations in space and time of this ecologically and societally important measure of ice thickness. The PI hypothesizes that measurements of ice of near-modal thickness have reduced bias and random error with respect to other measures of ice thickness, thus making the mode a robust parameter of the ITD. The goal of the proposed research is to enhance understanding of the ITD, focusing on the mode h_m, by producing improved estimates of space/time variations of the ITD, and greater understanding of its role in Arctic climate. Specific objectives are (a) to test the hypothesis that the mode is robust to observational errors; (b) to document space and time variation of h_m and other parameters of the ITD in the PIZ; (c) to simulate the variations of the ITD and h_m as a response to prescribed forcing and compare results with observations; (d) to study the dependence of simulated ITDs on the formulation of ridging and on errors in the forcing data; and (e) to synthesize the results as a new analysis and interpretation of the ITD emphasizing h_m and its variations. The study domain is the Arctic Ocean PIZ during the period 1975-2010. The timescales to be considered include inter-seasonal, inter-annual and trends, as well as physical processes acting on hours and longer. The goal of this research is to enhance understanding of the sea-ice thickness distribution (ITD), focusing on the mode, by producing improved estimates of space and time variations of the ITD, and greater understanding of its role in Arctic climate. Project objectives are (a) to test the hypothesis that the mode is more robust (than the mean value) to observational errors; (b) to document space and time variation of the mode and other parameters of the ITD; (c) to simulate the variations of the ITD as a response to prescribed forcing, and compare results with observations; (d) to study the dependence of simulated ITD on the formulation of ridging, and on errors in the forcing data; and (e) to synthesize the results as a new analysis and interpretation of the ITD emphasizing the mode and its variations. Observations acquired by moored and submarine-mounted upward-looking sonars, satellite-borne laser altimeter, and aircraft-mounted electromagnetic induction sensors will be analyzed. A...
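As a simple illustration of how a modal thickness h_m can be extracted from a sample of thickness (or draft) measurements, the sketch below finds the most populated histogram bin within the 1-5 m range quoted above. The bin width and thickness range are illustrative choices, not the project's actual processing.

```python
import numpy as np

def modal_thickness(thickness, bin_width=0.1, h_min=1.0, h_max=5.0):
    """Estimate the modal ice thickness h_m from thickness measurements,
    restricted to the range where the multi-year-ice mode is expected
    (here 1-5 m, excluding open water and young ice)."""
    thickness = np.asarray(thickness, float)
    sel = (thickness >= h_min) & (thickness <= h_max)
    bins = np.arange(h_min, h_max + bin_width, bin_width)
    counts, edges = np.histogram(thickness[sel], bins=bins)
    i = np.argmax(counts)                    # most populated bin
    return 0.5 * (edges[i] + edges[i + 1])   # bin-centre estimate of h_m
```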
Mean, SD, median, interquartiles, mode, skewness, and excess kurtosis of time for each nationality in the truncated dataset.
By VISHWANATH SESHAGIRI [source]
This dataset contains YouTube video and channel metadata for analyzing the statistical relation between videos and forming a topic tree. With 9 direct features and 13 more indirect features, it has all you need to build a deep understanding of how videos are related, including information such as total views per unit time, channel views, likes/subscribers ratio, comments/views ratio, and dislikes/subscribers ratio. This data provides a unique opportunity to gain insights on topics such as subscriber count trends over time or the impact of trends on subscriber engagement. We can develop models that show how different types of content drive viewership and identify the most popular styles or topics within YouTube's vast catalogue. Additionally, this data offers an intriguing look into consumer behaviour: we can explore what drives people to watch specific videos at certain times, or to appreciate certain channels more than others, by analyzing measures like likes per subscriber and dislikes per view. Finally, this dataset is completely open source with an easy-to-understand GitHub repo, making it an invaluable resource for anyone looking to gain better insights into how their audience interacts with their content and how they might improve it in the future.
How to Use This Dataset
In general, it is important to understand each parameter in the data set before proceeding with analysis. The parameters included are totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCount, dislikes/views, comments/subscriber, channelCommentCount, likes/dislikes, comments/views, dislikes/subscriber, totviews/totsubs, and views/elapsedtime.
To use this dataset for your own analysis:
1. Review each parameter's meaning and purpose in our dataset.
2. Get familiar with basic descriptive statistics such as the mean, median, mode, and range (see the sketch below).
3. Create visualizations or tables based on subsets of our data.
4. Understand correlations between different sets of variables or parameters.
5. Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables.
6. Analyze trends over time for individual parameters as well as the aggregate reaction from all users when videos are released.
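As a starting point for steps 2 and 4, here is a minimal pandas sketch. The column names are taken from the listing above and should be checked against the actual CSV header; for continuous ratio columns the mode is only illustrative.

```python
import pandas as pd

# Load the dataset (file name as given in the listing).
df = pd.read_csv("YouTubeDataset_withChannelElapsed.csv")

# Basic descriptive statistics for a few engagement ratios (step 2).
cols = ["totalviews/channelelapsedtime", "likes/subscriber", "views/subscribers"]
summary = pd.DataFrame({
    "mean": df[cols].mean(),
    "median": df[cols].median(),
    "mode": df[cols].mode().iloc[0],   # first mode if several values tie
    "range": df[cols].max() - df[cols].min(),
})
print(summary)

# Pairwise correlations between the selected parameters (step 4).
print(df[cols].corr())
```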
Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.
Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.
Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis by examining factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, so as to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to optimize content strategy accordingly in order to improve overall engagement rates across various types of video content and channel types.
If you use this dataset in your research, please credit the original authors.
License
Unknown License - Please check the dataset description for more information.
File: YouTubeDataset_withChannelElapsed.csv

| Column name | Description |
|:---|:---|
| totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
| channelViewCount | Total number of views for the channel. (Integer) |
| likes/subscriber | ... |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study aimed to explore the activity and functional connectivity of the default mode network (DMN) during meaning-making, and its mediating role in the relationship between stressful events and stress-related growth.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This section presents a discussion of the research data. The data was received as secondary data; however, it was originally collected using time study techniques. Data validation is a crucial step in the data analysis process to ensure that the data is accurate, complete, and reliable. Descriptive statistics were used to validate the data. The mean, mode, standard deviation, variance and range provide a summary of the data distribution and assist in identifying outliers or unusual patterns. The data presented in the dataset show the measures of central tendency, which include the mean, median and mode. The mean signifies the average value of each of the factors presented in the tables; it is the balance point of the dataset and reflects its typical value and behaviour. The median is the middle value of the dataset for each of the factors presented: half of the values lie below this point and the other half lie above it, which is important for skewed distributions. The mode is the most common value in the dataset and was used to describe the most typical observation. These values are important as they describe the central value around which the data is distributed. The mean, mode and median indicate a skewed distribution, as they are neither similar nor close to one another.

The dataset also presents the results and a discussion of the results. This part focuses on the customisation of the DMAIC (Define, Measure, Analyse, Improve, Control) framework to address the specific concerns outlined in the problem statement. To gain a comprehensive understanding of the current process, value stream mapping was employed, which is further enhanced by measuring the factors that contribute to inefficiencies. These factors are then analysed and ranked based on their impact, utilising factor analysis. To mitigate the impact of the most influential factor on project inefficiencies, a solution is proposed using the EOQ (Economic Order Quantity) model. The implementation of the 'CiteOps' software facilitates improved scheduling, monitoring, and task delegation in the construction project through digitalisation. Furthermore, project progress and efficiency are monitored remotely and in real time. In summary, the DMAIC framework was tailored to the requirements of the specific project, incorporating techniques from inventory management, project management, and statistics to effectively minimise inefficiencies within the construction project.
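To make the EOQ step concrete, here is a minimal sketch of the classic EOQ formula; the demand, ordering-cost, and holding-cost figures are hypothetical placeholders, not values from the study.

```python
from math import sqrt

def economic_order_quantity(annual_demand, order_cost, holding_cost):
    """Classic EOQ: order size that minimises total ordering plus holding cost."""
    return sqrt(2 * annual_demand * order_cost / holding_cost)

# Hypothetical figures for illustration only:
# demand of 1200 units/year, 500 cost per order, 20 holding cost per unit per year.
q_star = economic_order_quantity(1200, 500, 20)
print(f"Optimal order quantity: {q_star:.0f} units")  # ~245 units
```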
In the Sun, the frequencies of the acoustic modes are observed to vary in phase with the magnetic activity level. These frequency variations are expected to be common in solar-type stars and contain information about the activity-related changes that take place in their interiors. The unprecedented duration of Kepler photometric time-series provides a unique opportunity to detect and characterize stellar magnetic cycles through asteroseismology. In this work, we analyze a sample of 87 solar-type stars, measuring their temporal frequency shifts over segments of 90 days. For each segment, the individual frequencies are obtained through a Bayesian peak-bagging tool. The mean frequency shifts are then computed and compared with: (1) those obtained from a cross-correlation method; (2) the variation in the mode heights; (3) a photometric activity proxy; and (4) the characteristic timescale of the granulation. For each star and 90-day sub-series, we provide mean frequency shifts, mode heights, and characteristic timescales of the granulation. Interestingly, more than 60% of the stars show evidence for (quasi-)periodic variations in the frequency shifts. In the majority of the cases, these variations are accompanied by variations in other activity proxies. About 20% of the stars show mode frequencies and heights varying approximately in phase, in opposition to what is observed for the Sun.
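As an illustration of how a mean frequency shift can be derived from the individual mode frequencies of one 90-day segment, the sketch below uses an inverse-variance weighted average relative to reference values. This is a common convention, not necessarily the exact weighting adopted in the paper.

```python
import numpy as np

def mean_frequency_shift(f_segment, f_reference, sigma):
    """Inverse-variance weighted mean shift of individual mode frequencies
    in one 90-day segment relative to reference values (e.g., in muHz)."""
    f_segment, f_reference, sigma = map(np.asarray, (f_segment, f_reference, sigma))
    shifts = f_segment - f_reference          # per-mode frequency shifts
    weights = 1.0 / sigma ** 2                # weight well-determined modes more
    mean = np.sum(weights * shifts) / np.sum(weights)
    error = np.sqrt(1.0 / np.sum(weights))    # formal uncertainty of the mean
    return mean, error
```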
We drilled three sites (Sites 1071, 1072, and 1073) on the New Jersey shelf and slope at water depths between 88 and 664 m. Grain-size analyses from shelf sites (Sites 1071 and 1072) define five types of sediment: well-sorted fine sand, silty sand or sandy silt, clayey silt, poorly sorted sandy mud, and poorly sorted lag sediments. At slope Site 1073, a grain-size minimum of 3-6 µm is found at 300 meters below seafloor. These sediments are well sorted and lack sand- and clay-sized grains. Horizons of coarse-grained sediments are present in Unit I at Site 1073.
Multicellular complexity is a central topic in biology, but the evolutionary processes underlying its origin are difficult to study and remain poorly understood. Here we use experimental evolution to investigate the tempo and mode of multicellular adaptation during a de novo evolutionary transition to multicellularity. Multicelled "snowflake" yeast evolved from a unicellular ancestor after 7 days of selection for faster settling through liquid media. Over the next 220 days, snowflake yeast evolved to settle 44% more quickly. Throughout the experiment the clusters evolved faster settling by three distinct modes. The number of cells per cluster increased from a mean of 42 cells after 7 days of selection to 114 cells after 227 days. Between days 28 and 65, larger clusters evolved via a twofold increase in the mass of individual cells. By day 227, snowflake yeast evolved to form more hydrodynamic clusters that settle more quickly for their size than ancestral strains. The timing and nature ...
Mean ± standard deviation (SD), minimum (Min) and maximum (Max) values of repeated measurements of two-dimensional (n = 12), M-mode (n = 3) and anatomic M-mode (n = 3) echocardiographic variables (n = 18) obtained by a trained observer in 4 Borneo orangutans (Pongo pygmaeus pygmaeus) from 96 transthoracic examinations, BO1 to BO3 being the females and BO4 the male.
Siliciclastic sedimentation at Ocean Drilling Program Site 1017 on the southern slope of the Santa Lucia Bank, central California margin, responded closely to oceanographic and climatic change over the past ~130 ka. Variation in mean grain size and sediment sorting within the ~25-m-thick succession from Hole 1017E shows Milankovitch-band to submillennial-scale variation. Mean grain size of the "sortable silt" fraction (10-63 µm) ranges from 17.6 to 33.9 µm (average 24.8 µm) and is inversely correlated with the degree of sorting. Much of the sediment has a bimodal or trimodal grain-size distribution that is composed of distinct fine silt, coarse silt to fine sand, and clay-size components. The position of the mode and the sorting of each component change through the succession, but the primary variation is in the presence or abundance of the coarse silt fraction that controls the overall mean grain size and sorting of the sample. The occurrence of the best-sorted, finest-grained sediment at high stands of sea level (Holocene, marine isotope Substages 5c and 5e) reflects the linkage between global climate and the sedimentary record at Site 1017 and suggests that the efficiency of off-shelf transport is a key control of sedimentation on the Santa Lucia Slope. It is not clear what proportion of the variation in grain size and sorting may also be caused by variations in bottom current strength and in situ hydrodynamic sorting.
The file shows the groundwater level in the province of North Brabant. The groundwater stage is based on the mode, meaning the most common class of the groundwater stage among the realizations from which it is calculated. It gives a classification/indication of the average highest groundwater level (GHG) and the average lowest groundwater level (GLG).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The set of local modes and density ridge lines are important summary characteristics of the data-generating distribution. In this work, we focus on estimating local modes and density ridges from point cloud data in a product space combining two or more Euclidean and/or directional metric spaces. Specifically, our approach extends the (subspace constrained) mean shift algorithm to such product spaces, addressing potential challenges in the generalization process. We establish the algorithmic convergence of the proposed methods, along with practical implementation guidelines. Experiments on simulated and real-world datasets demonstrate the effectiveness of our proposed methods. Supplementary materials for this article are available online.
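For intuition, here is a minimal sketch of one mean shift step in a simple product space (a Euclidean component times a circular component), assuming a Gaussian kernel on the Euclidean part and a von Mises-type kernel on the angle. This is only an illustration of the general idea, not the authors' implementation; the subspace-constrained variant used for ridge estimation would additionally project the update onto an estimated normal subspace.

```python
import numpy as np

def product_mean_shift_step(x, theta, X, Theta, h, kappa):
    """One mean shift step in a product space R^d x S^1.
    X: (n, d) Euclidean coordinates; Theta: (n,) angles in radians.
    Gaussian kernel on the Euclidean part, von Mises-type kernel on the circle."""
    u = np.array([np.cos(theta), np.sin(theta)])            # query direction
    U = np.column_stack([np.cos(Theta), np.sin(Theta)])     # data directions
    w_lin = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))
    w_dir = np.exp(kappa * (U @ u))                         # von Mises weights
    w = w_lin * w_dir                                       # product kernel
    x_new = w @ X / w.sum()                                 # Euclidean update
    m = w @ U
    theta_new = np.arctan2(m[1], m[0])                      # circular update
    return x_new, theta_new
```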
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Supplement to Occipital and left temporal instantaneous amplitude and frequency oscillations correlated with access and phenomenal consciousness (https://philpapers.org/rec/PEROAL-2).
Occipital and left temporal instantaneous amplitude and frequency oscillations correlated with access and phenomenal consciousness move from the features of the ERP characterized in Occipital and Left Temporal EEG Correlates of Phenomenal Consciousness (Pereira, 2015, https://doi.org/10.1016/b978-0-12-802508-6.00018-1, https://philpapers.org/rec/PEROAL) towards the instantaneous amplitude and frequency of event-related changes correlated with a contrast in access and in phenomenology.
Occipital and left temporal instantaneous amplitude and frequency oscillations correlated with access and phenomenal consciousness proceed as follows.
In the first section, empirical mode decomposition (EMD) with post-processing, ensemble empirical mode decomposition (postEEMD), and the Hilbert-Huang transform (HHT) are applied (Xie, G., Guo, Y., Tong, S., and Ma, L., 2014. Calculate excess mortality during heatwaves using Hilbert-Huang transform algorithm. BMC Medical Research Methodology, 14, 35).
In the second section, the variance inflation factor (VIF) is calculated (a sketch is given below).
In the third section, partial least squares regression (PLSR) is used, selecting the model with the minimal root mean squared error of prediction (RMSEP).
In the last section, the significance multivariate correlation (sMC) statistic is computed for the PLSR model.
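To illustrate the VIF computation from the second section, here is a minimal NumPy sketch of the standard definition, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors; this is a generic sketch, not the supplement's code.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of the predictor matrix X (n x p)."""
    X = np.asarray(X, float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        # Regress column j on the other columns (with an intercept).
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)
```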
The EFW burst modes provide targeted measurements over brief time intervals of 3-D electric fields, 3-D wave magnetic fields, and spacecraft potential. There are two EFW burst modes: BURST1 (B1), a medium-rate mode at a nominal 512 samples/s, and BURST2 (B2), a higher-rate mode at a nominal 16384 samples/s. The Burst 1 data include three components of the electric field (E12_B1, E34_B1, E56_B1), six components of the spacecraft-sensor potential (V1_B1 through V6_B1), and three components of the AC magnetic field (SCM_U_B1, SCM_V_B1, SCM_W_B1 from the EMFISIS Search Coil Magnetometer). The Burst 2 data return a similar complement of electric field (E12ac_B2, E34ac_B2, E56ac_B2), search coil (SCM_U_B2, SCM_V_B2, SCM_W_B2, again from the EMFISIS Search Coil Magnetometer), and single-ended potential measurements (V1ac_B2 through V6ac_B2), with the exception that in the default mode the single-ended potential and electric field signals are AC coupled with a higher gain. All quantities are in "uvw" coordinates, where "u" and "v" are the sensor coordinates rotating with the spacecraft and "w" points along the spacecraft spin axis. Burst waveform CDF files are available. The three data types available are the electric field ("E"), the search coil magnetic field ("MSC"), and the antenna potential ("V"); the suffix on each is either "B1" or "B2". B1, or Burst 1, is the human-in-the-loop burst type, meaning that both collection and playback (for arbitrary lengths of time) are requested from the ground. B2, or Burst 2, is automatically telemetered as short bursts based on an onboard triggering algorithm, typically set to trigger on large-amplitude signals near 1 kHz. Sample rates for B1 and B2 can be, and are, changed depending on varying science goals.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time series data for the statistic Passenger volume by mode (billion tons-km) - Rail transport and country Marshall Islands. Indicator Definition:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Posterior inferences for mode, mean, median, and higher posterior density (HPD) interval of the broad-sense heritability per plot, considering the proposed Bayesian single-trait multi-environment (BSTME) and multi-trait multi-environment (BMTME) models for number of days to maturity (DM), 100-seed weight (SW) (grams), and average seed yield per plot (SY) (grams).
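Assuming posterior draws for the heritability are available (e.g., MCMC samples from the BSTME or BMTME fits), the sketch below shows one common way to obtain the posterior mean, median, a histogram-based mode, and the highest posterior density (HPD) interval; the bin count and interval probability are illustrative choices, not the models' settings.

```python
import numpy as np

def hpd_interval(samples, prob=0.95):
    """Shortest interval containing `prob` of the posterior samples
    (HPD interval for a unimodal posterior)."""
    s = np.sort(np.asarray(samples, float))
    n = len(s)
    k = int(np.floor(prob * n))
    widths = s[k:] - s[: n - k]       # widths of all candidate intervals
    i = np.argmin(widths)             # shortest one
    return s[i], s[i + k]

def posterior_summaries(samples, bins=200):
    """Mean, median, and a histogram-based mode of posterior draws."""
    samples = np.asarray(samples, float)
    counts, edges = np.histogram(samples, bins=bins)
    j = np.argmax(counts)
    mode = 0.5 * (edges[j] + edges[j + 1])
    return {"mean": samples.mean(), "median": np.median(samples), "mode": mode}
```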