Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Card for HQ-EDIT
HQ-Edit is a high-quality instruction-based image editing dataset with a total of 197,350 edits. Unlike prior approaches that rely on attribute guidance or human feedback to build datasets, we devise a scalable data collection pipeline leveraging advanced foundation models, namely GPT-4V and DALL-E 3. HQ-Edit’s high-resolution images, rich in detail and accompanied by comprehensive editing prompts, substantially enhance the capabilities of existing image editing… See the full description on the dataset page: https://huggingface.co/datasets/UCSC-VLAA/HQ-Edit-data-demo.
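A minimal loading sketch with the Hugging Face datasets library; the repo id comes from the dataset page above, while the split name and per-sample fields are assumptions to verify against the card.

```python
# Minimal sketch, assuming the demo data exposes a "train" split; per-sample
# field names are not stated above and should be checked against the card.
from datasets import load_dataset

ds = load_dataset("UCSC-VLAA/HQ-Edit-data-demo", split="train")
print(ds)            # shows features and number of edits
print(ds[0].keys())  # one edit: input image, edit instruction, output image (names may vary)
```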
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
OmniEdit
In this paper, we present OMNI-EDIT, an omnipotent editor that handles seven different image editing tasks with any aspect ratio seamlessly. Our contribution is four-fold: (1) OMNI-EDIT is trained using supervision from seven different specialist models to ensure task coverage. (2) We use importance sampling based on scores provided by large multimodal models (like GPT-4o) instead of CLIP-score to improve data quality. 📃Paper | 🌐Website |… See the full description on the dataset page: https://huggingface.co/datasets/TIGER-Lab/OmniEdit-Filtered-1.2M.
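A minimal sketch of streaming a few samples with the Hugging Face datasets library, so the full 1.2M-example set need not be downloaded; the repo id is taken from the dataset page above, and the split name and field names are assumptions.

```python
# Minimal sketch: stream a handful of samples instead of downloading 1.2M edits.
# The split name "train" and the field names are assumptions.
from itertools import islice
from datasets import load_dataset

stream = load_dataset("TIGER-Lab/OmniEdit-Filtered-1.2M", split="train", streaming=True)
for sample in islice(stream, 3):
    print(sorted(sample.keys()))
```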
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Deekshitha Prabhakar
Released under Apache 2.0
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Advances in base editor development now enable all possible base-pair conversions. Currently, laborious one-by-one testing has been used to select or engineer the optimal variant for inducing a specific base substitution with maximal efficiency yet minimal undesired effects. This thesis work presents a high-throughput activity profiling platform to streamline the evaluation process by enabling simultaneous performance assessment of a diverse pool of base editor variants at scale. This platform generates single-nucleotide resolution readouts, allowing quantitative measurements of each variant’s performance within a cytosine base editor library, including editing efficiency, substrate motif preference, positional biases and haplotype analysis. Undesired outcomes such as impure edits, indels and noncanonical base conversions are also uncovered during the process. This work further demonstrates the discovery power of this platform via an sgRNA scaffold library, identifying two scaffold variants, SV48 and SV240, that enhance base editing efficiency while maintaining an acceptable rate of inducing undesired edits. This work also explores the potential of integrating machine learning techniques to broaden the scope of engineering with the platform, which further lowers the experimental burden. By introducing slight modifications, this platform can be adapted for parallel engineering and screening of other precise genome editors such as adenine base editors and prime editors. With the continuously expanding repertoire of genome editing tools, this platform addresses the pressing need for scalable, unbiased, and rapid benchmarking of engineered variants. This would also accelerate the development of next-generation precise genome editors and pave the way for specialised editor design by optimising, profiling, and selecting the most suitable tools for specific applications.
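As an illustration of the single-nucleotide resolution readout described above (not the thesis's actual pipeline), the hedged sketch below counts C-to-T conversions per reference position across a set of hypothetical aligned reads to yield per-position editing efficiencies.

```python
# Illustrative sketch only: per-position C-to-T editing efficiency for a cytosine
# base editor, computed from aligned read strings. The reads and reference are
# hypothetical; the thesis's actual pipeline is not reproduced here.
from collections import Counter

reference = "ACCTGCATCGACCTGGAATT"          # protospacer-scale reference (hypothetical)
reads = [
    "ACTTGCATCGACCTGGAATT",                 # C3 -> T edit
    "ACTTGTATCGACCTGGAATT",                 # C3 -> T and C6 -> T edits
    "ACCTGCATCGACCTGGAATT",                 # unedited
]

def c_to_t_efficiency(reference, reads):
    """Fraction of reads carrying a C->T conversion at each reference C position (1-based)."""
    efficiency = {}
    for pos, ref_base in enumerate(reference, start=1):
        if ref_base != "C":
            continue
        calls = Counter(read[pos - 1] for read in reads)
        efficiency[pos] = calls["T"] / len(reads)
    return efficiency

print(c_to_t_efficiency(reference, reads))   # e.g. position 3 -> ~0.67, position 6 -> ~0.33
```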
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Explore RNA editing through data, from visualizations to datasets, all based on diverse sources.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Explore Digital video editing with Final Cut Express: the real-world guide to set up and workflow through data • Key facts: author, publication date, book publisher, book series, book subjects • Real-time news, visualizations and datasets
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The PE²rr corpus contains source language texts from different domains along with their automatically generated translations into several morphologically rich languages, their post-edited versions, and error annotations of the performed post-edit operations. The main advantage of the corpus is the fusion of post-editing and error classification, which are usually treated as two independent tasks even though they are naturally intertwined.
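As a hedged illustration of what post-edit operations look like (not the PE²rr annotation scheme itself), the sketch below recovers word-level insert/delete/replace operations between a hypothetical machine translation and its post-edited version.

```python
# Minimal sketch, not the PE2rr annotation scheme: recover word-level post-edit
# operations between a raw MT output and its post-edited version with difflib.
# The sentences are hypothetical.
import difflib

mt_output   = "the cat sit on mat".split()
post_edited = "the cat sits on the mat".split()

matcher = difflib.SequenceMatcher(a=mt_output, b=post_edited)
for op, i1, i2, j1, j2 in matcher.get_opcodes():
    if op != "equal":
        print(op, mt_output[i1:i2], "->", post_edited[j1:j2])
# replace ['sit'] -> ['sits']
# insert [] -> ['the']
```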
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for InstructCLIP-InstructPix2Pix-Data
The dataset can be used to train instruction-guided image editing models. It is built on top of InstructPix2Pix CLIP-filtered with new edit instructions. For each sample, source_image is the original image, instruction is the edit instruction, target_image is the edited image, and original_instruction is the edit instruction from the InstructPix2Pix CLIP-filtered dataset. Please refer to our repo to see how the edit instructions… See the full description on the dataset page: https://huggingface.co/datasets/SherryXTChen/InstructCLIP-InstructPix2Pix-Data.
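A minimal sketch of reading one sample and the fields named above with the Hugging Face datasets library; the repo id is from the dataset page, while the split name and the use of streaming are assumptions.

```python
# Minimal sketch; the split name "train" and streaming mode are assumptions.
from datasets import load_dataset

ds = load_dataset("SherryXTChen/InstructCLIP-InstructPix2Pix-Data", split="train", streaming=True)
sample = next(iter(ds))
print(sample["instruction"], "|", sample["original_instruction"])
# sample["source_image"] and sample["target_image"] hold the before/after image pair
```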
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This layer features special areas of interest (AOIs) that have been contributed to Esri Community Maps using the new Community Maps Editor app. The data that is accepted by Esri will be included in selected Esri basemaps, including our suite of Esri Vector Basemaps, and made available through this layer to export and use offline.

Export Data: The contributed data is also available for contributors and other users to export (or extract) and re-use for their own purposes. Users can export the full layer from the ArcGIS Online item details page by clicking the Export Data button and selecting one of the supported formats (e.g., shapefile or file geodatabase (FGDB)). Users can extract selected layers for an area of interest by opening the layer in Map Viewer, clicking the Analysis button, viewing the Manage Data tools, and using the Extract Data tool. To display this data with proper symbology and metadata in ArcGIS Pro, you can download and use this layer file.

Data Usage: The data contributed through the Community Maps Editor app is primarily intended for use in the Esri Basemaps. Esri staff will periodically (e.g., weekly) review the contents of the contributed data and either accept or reject the data for use in the basemaps. Accepted features will be added to the Esri basemaps in a subsequent update and will remain in the app for the contributor or others to edit over time. Rejected features will be removed from the app. Esri Community Maps Contributors and other ArcGIS Online users can download accepted features from this layer for their internal use or map publishing, subject to the terms of use below.
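A hedged sketch of the Export Data workflow described above using the ArcGIS API for Python; the item id and credentials are placeholders, and exporting assumes an account with the required privileges.

```python
# Minimal sketch, using the ArcGIS API for Python, of the "Export Data" workflow
# described above. ITEM_ID and the credentials are placeholders.
from arcgis.gis import GIS

gis = GIS("https://www.arcgis.com", "your_username", "your_password")  # placeholder credentials
item = gis.content.get("ITEM_ID")                                      # the AOI layer item
exported = item.export("community_maps_aois", "File Geodatabase")      # or "Shapefile"
exported.download(save_path=".")                                       # downloads the export as a .zip
```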
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was analyzed and produced during the study described in the paper "Relating Wikipedia Article Quality to Edit Behavior and Link Structure" (under review; DOI and link to follow, see references). Its creation process and use cases are described in the dedicated paper.
For directions and code to process and evaluate this data, please see the corresponding GitHub repository: https://github.com/ruptho/editlinkquality-wikipedia.
We provide three files for 4941 Wikipedia articles (in .pkl format): The "article_revisions_labeled.pkl" file provides the final, semantically labeled revisions for each analyzed article per quality category. The "article_revision_features.zip" file contains processed per-article features, divided into folders for the specific quality categories they belong to. In "article_revision_features_raw.zip", we provide the raw features as retrieved via the RevScoring API (https://pythonhosted.org/revscoring/).
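A minimal sketch for inspecting the labeled-revisions file; it assumes the .pkl files deserialize into pandas objects (plain pickle.load would work otherwise) and makes no assumption about column names.

```python
# Minimal sketch: inspect the labeled revisions file. Assumes the .pkl holds a
# pandas object; if not, the standard pickle module can load it instead.
import pandas as pd

revisions = pd.read_pickle("article_revisions_labeled.pkl")
print(type(revisions))
print(revisions.head() if hasattr(revisions, "head") else list(revisions)[:5])
```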
Supplementary File 1 (EisenbergSI_Data1.txt): a text file in FASTA format with the constructed squid coding sequences.
Supplementary File 2 (EisenbergSI_Table1.xlsx): a spreadsheet with all the A-to-G modification sites detected in the coding regions of the squid, along with their number of supporting reads in all the tissues studied.
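A minimal sketch for reading the two supplementary files named above; it assumes Biopython and an Excel engine such as openpyxl are installed, and makes no assumption about the spreadsheet's sheet or column names.

```python
# Minimal sketch: read the FASTA file of coding sequences and the spreadsheet of
# A-to-G sites. Requires Biopython and an Excel reader (e.g. openpyxl).
from Bio import SeqIO
import pandas as pd

coding_seqs = list(SeqIO.parse("EisenbergSI_Data1.txt", "fasta"))
print(len(coding_seqs), "squid coding sequences")

editing_sites = pd.read_excel("EisenbergSI_Table1.xlsx")
print(editing_sites.shape, "rows x columns of A-to-G modification sites")
```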
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data files related to the GeCKOv2 data set as used in the JACKS 2018 paper.
This dataset contains files of protein sequences, DNA sequences, RNA sequences, sequencing files, brain slice images, and flow analysis, used for the paper.
https://doi.org/10.5061/dryad.mkkwh716m
The file entitled “Protein_and_RNA_sequences” contains the RNA and protein sequences used in the study. The files entitled “CL7_NLS_Cas9_A22p3-His”, “His_CL7_2NLS_iCas12a-2NLS”, “His_CL7_2NLS_iCas12a-4NLS”, “His_CL7_NLS_Cas9(dS)*A22p3”, “His_CL7_NLS_Cas9_A22p”, and “His_CL7_NLS_Cas9_A22p3” are the SnapGene files for the plasmid sequences in the study. The file entitled “mGluR5-TH-NPC_and_mouse_brain_editing-NGS_data” contains the unprocessed next-generation sequencing data in the study. The file entitled “Examples_of_flow_analysis” contains examples of flow analysis in the study. The files entitled “Brain_imaging*(tdTomato_Ai9)” and “Ai9_-_co-delivery_of_tdTom_a...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This event has been computationally inferred from an event that has been demonstrated in another species.
The inference is based on the homology mapping from PANTHER. Briefly, reactions for which all involved PhysicalEntities (in input, output and catalyst) have a mapped orthologue/paralogue (for complexes at least 75% of components must have a mapping) are inferred to the other species. High level events are also inferred for these events to allow for easier navigation.
More details and caveats of the event inference are described in Reactome. For details on PANTHER, see also: http://www.pantherdb.org/about.jsp
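An illustrative sketch of the inference criterion summarized above (not Reactome's actual implementation): a reaction is inferred only if every participant maps to the target species, with complexes counted as mapped when at least 75% of their components have an orthologue.

```python
# Illustrative sketch of the inference rule described above, not Reactome's code.
def entity_maps(entity, orthologue_map):
    """entity is either a gene id (str) or a complex given as a list of gene ids."""
    if isinstance(entity, list):                      # complex: >= 75% of components must map
        mapped = sum(1 for component in entity if component in orthologue_map)
        return mapped / len(entity) >= 0.75
    return entity in orthologue_map                   # simple physical entity

def reaction_is_inferred(inputs, outputs, catalysts, orthologue_map):
    participants = inputs + outputs + catalysts
    return all(entity_maps(e, orthologue_map) for e in participants)

# Hypothetical example: a complex with 3 of 4 components mapped still qualifies.
orthologues = {"geneA": "geneA'", "geneB": "geneB'", "geneC": "geneC'"}
print(reaction_is_inferred(["geneA"], ["geneB"], [["geneA", "geneB", "geneC", "geneD"]], orthologues))  # True
```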
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Amount (GBP) allocated to images of each of the described species in our study by each participant. Amounts are hypothetical, i.e., no real funds were given. Participants were shown the images in groups of three (in the order shown in the data) and asked to allocate 5,000 GBP among the three animals in these images.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/P1VECE
This dataset contains data and software for the following paper: Hill, Benjamin Mako and Shaw, Aaron. (2015) “Page Protection: Another Missing Dimension of Wikipedia Research.” In Proceedings of the 11th International Symposium on Open Collaboration (OpenSym 2015). ACM Press. doi: 10.1145/2788993.2789846

This is an archival version of the data and software released with the paper. All of these data were (and, at the time of writing, continue to be) hosted at: https://communitydata.cc/wiki-proetection/

Page protection is a feature of MediaWiki software that allows administrators to restrict contributions to particular pages. For example, a page can be “protected” so that only administrators or logged-in editors with a history of good editing can edit, move, or create it. Protection might involve “full protection,” where a page can only be edited by administrators (i.e., “sysops”), or “semi-protection,” where a page can only be edited by accounts with a history of good edits (i.e., “autoconfirmed” users).

Although largely hidden, page protection profoundly shapes activity on the site. For example, page protection is an important tool used to manage access and participation in situations where vandalism or interpersonal conflict can threaten to undermine content quality. While protection affects only a small portion of pages in English Wikipedia, many of the most highly viewed pages are protected. For example, the “Main Page” in English Wikipedia has been protected since February 2006, and all Featured Articles are protected at the time they appear on the site’s main page. Millions of viewers may never edit Wikipedia because they never see an edit button.

Despite its widespread and influential nature, very little quantitative research on Wikipedia has taken page protection into account systematically. This page contains software and data to help Wikipedia researchers do exactly this in their work. Because a page’s protection status can change over time, the snapshots of page protection data stored by Wikimedia and published by the Wikimedia Foundation as dumps are incomplete. As a result, taking protection into account involves looking at several different sources of data. Much more detail can be found in our paper, Page Protection: Another Missing Dimension of Wikipedia Research. If you use this software or these data, we would appreciate it if you cite the paper.
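A minimal sketch of checking a page's current protection status through the MediaWiki API; as the description notes, this reflects only the present state, and reconstructing historical protection requires combining several data sources.

```python
# Minimal sketch: query the MediaWiki API for a page's current protection status.
import requests

resp = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={
        "action": "query",
        "prop": "info",
        "inprop": "protection",
        "titles": "Main Page",
        "format": "json",
    },
    headers={"User-Agent": "protection-check-example/0.1"},
)
page = next(iter(resp.json()["query"]["pages"].values()))
print(page["protection"])  # e.g. [{'type': 'edit', 'level': 'sysop', 'expiry': 'infinity'}, ...]
```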
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using a commonly used synthetic dataset and one real-world dataset, we investigate the potential and gaps of three state-of-the-art neural code editors (Graph2Edit, Hoppity, SequenceR) for DL-based realistic vulnerability data generation, and use two state-of-the-art vulnerability detectors (Devign, ReVeal) to evaluate the effectiveness of the generated realistic vulnerability data.
Once Docker is installed, download the Docker image "neural_editors_vulgen_docker.tar.xz".
Then, check the README.md for detailed steps to reproduce the experiments.
We also provide a simple package of the artifact, "neural_editors_vulgen.zip", which also contains the raw data of our experiments. However, using it to reproduce the experiments requires users to set up the environments and dependencies for all five tools themselves, which is not recommended.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Colombia ASS: Editing Activities: Gross Operating Surplus data was reported at 354,757,993.000 COP th in 2017. This records a decrease from the previous number of 463,021,967.000 COP th for 2016. Colombia ASS: Editing Activities: Gross Operating Surplus data is updated yearly, averaging 408,889,980.000 COP th from Dec 2016 (Median) to 2017, with 2 observations. The data reached an all-time high of 463,021,967.000 COP th in 2016 and a record low of 354,757,993.000 COP th in 2017. Colombia ASS: Editing Activities: Gross Operating Surplus data remains active status in CEIC and is reported by National Statistics Administrative Department. The data is categorized under Global Database’s Colombia – Table CO.H014: Annual Service Survey.
We engineered circular ADAR-recruiting guide RNAs (cadRNAs) that efficiently recruit endogenous ADARs to edit specific sites on target RNAs. We tested two circular ADAR-recruiting guide RNAs (cadRNAs). "Circular 100,50" refers to a cadRNA targeting the RAB7A transcript with an antisense domain of length 100 bp and containing a centrally located C-mismatch (position 50). Similarly, "Circular 200,100" refers to a cadRNA targeting the RAB7A transcript with an antisense domain of length 200 bp and containing a centrally located C-mismatch (position 100). "293FT" refers to the control sample, i.e., HEK-293FT cells that were not treated with any cadRNA.
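A hedged sketch of the guide-design idea described above: the antisense domain is the reverse complement of the target window, with a C mismatch forced at its centre opposite the edited adenosine. The target sequence and helper function here are hypothetical, not the actual RAB7A window or the authors' design code.

```python
# Illustrative sketch only (hypothetical target sequence, not the actual RAB7A
# window): build an antisense domain as the reverse complement of the target,
# then place a C mismatch at the central position opposite the edited adenosine.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def antisense_with_c_mismatch(target_rna: str, mismatch_pos: int) -> str:
    """Reverse-complement the target and force a C at mismatch_pos (1-based)."""
    antisense = target_rna.translate(COMPLEMENT)[::-1]
    guide = list(antisense)
    guide[mismatch_pos - 1] = "C"
    return "".join(guide)

target = "GCAU" * 25                          # 100-nt placeholder; the A at position 51 is the edit site
guide = antisense_with_c_mismatch(target, mismatch_pos=50)
print(len(guide), guide[49])                  # 100 C  -> a "Circular 100,50"-style design
```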
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The Marine Environment Classification (MEC), a GIS-based environmental classification of the marine environment of the New Zealand region, is an ecosystem-based spatial framework designed for marine management purposes. Developed by NIWA with support from the Ministry for the Environment (MfE), Department of Conservation and Ministry of Fisheries, and with contributions from several other stakeholders, the MEC provides a spatial framework for inventories of marine resources, environmental effects assessments, policy development and design of protected area networks. Two levels of spatial resolution are available within the MEC. A broad scale classification covers the entire EEZ at a nominal spatial resolution of 1 km, whereas the finer scale classification of the Hauraki Gulf region has a nominal spatial resolution of 200 m.

Several spatially explicit data layers describing the physical environment define the MEC. A physically-based classification was chosen because data on these physical variables were available or could be modelled, and because the pattern of the physical environment is a reasonable surrogate for biological pattern, particularly at larger spatial scales. Classes within the classification were defined using multivariate clustering methods. These produce hierarchical classifications that enable the user to delineate environmental variation at different levels of detail and associated spatial scales. Large biological datasets were used to tune the classification, so that the physically-based classes maximised discrimination of variation in biological composition at various levels of classification detail. Thus, the MEC provides a general classification that is relevant to most groups of marine organisms (fishes, invertebrates and chlorophyll) and to ecologically important abiotic variables (e.g., temperature, nutrients).

An overview report describing the MEC is available as a PDF file (external link). The overview report covers the conceptual basis for the MEC and results of testing the classification: MEC Overview (PDF 2.7 MB). See here for a longer description: https://www.niwa.co.nz/coasts-and-oceans/our-services/marine-environment-classification

Layers (no data edit dates available):
Exclusive Economic Zone (EEZ): FID, ENTITY, LAYER, ELEVATION, THICKNESS, COLOR
MEC EEZ 40 class: FID, GRP_40, COUNT_
MEC EEZ 20 class: FID, GRP_20, COUNT_
MEC EEZ 10 class: FID, GRP_10, COUNT_
MEC EEZ 05 class: FID, GRP_5, COUNT_
Coastline: FID, NZCOAST_ID, SHAPE_LENG
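A minimal sketch, assuming the 20-class EEZ layer has been exported as a shapefile (the filename is a placeholder), that lists the MEC classes via the GRP_20 field named in the layer listing above.

```python
# Minimal sketch: summarise the 20-class MEC layer from a locally exported
# shapefile. The filename is a placeholder; GRP_20 is the class field listed above.
import geopandas as gpd

mec20 = gpd.read_file("MEC_EEZ_20_class.shp")
print(mec20.crs)
print(mec20["GRP_20"].value_counts().sort_index())  # polygon count per MEC class
```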