Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LAS&T is the largest and most diverse dataset for shape, texture and material recognition and retrieval in 2D and 3D with 650,000 images, based on real world shapes and textures.
The LAS&T dataset aims to test the most basic aspect of vision in the most general way: the ability to identify any shape, texture, and material in any setting and environment, without being limited to specific types or classes of objects, materials, and environments. For shapes, this means identifying and retrieving any shape in 2D or 3D when every other property changes between images, including the material and texture, orientation, size, and environment. For textures and materials, the goal is to recognize the same texture or material when it appears on different objects, in different environments, and under different light conditions. The dataset relies on shapes, textures, and materials extracted from real-world images, leading to an almost unlimited quantity and diversity of real-world natural patterns. Each section of the dataset (shapes and textures) contains 3D parts, which rely on physics-based scenes with realistic simulation of light, materials, and objects, and abstract 2D parts. In addition, a real-world image benchmark for 3D shape recognition is supplied.
3D shape recognition and retrieval.
2D shape recognition and retrieval.
3D Materials recognition and retrieval.
2D Texture recognition and retrieval.
Each part can be used independently for training and testing.
Additional assets:
A set of 350,000 natural 2D shapes extracted from real-world images (SHAPES_COLLECTION_350k.zip).
A real-world image benchmark for 3D shape recognition.
The scripts used to generate and test the dataset are supplied in the SCRIPT** files.
For shape recognition, the goal is to identify the same shape in different images where the material/texture/color of the shape is changed, the shape is rotated, and the background is replaced. Hence, only the shape remains the same in both images. For 3D shapes/objects, this is tested with realistic light simulation. All files with 3D shapes contain samples of the 3D shape dataset, and all files with 2D shapes contain samples of the 2D shape dataset. Example files contain example images for each set.
3D_Shape_Recognition_Synthethic_GENERAL_LARGE_SET_76k.zip: a large number of synthetic examples of 3D shapes with maximum variability; can be used for training/testing 3D shape/object recognition/retrieval.
2D_Shapes_Recognition_Textured_Synthetic_Resize2_GENERAL_LARGE_SET_61k.zip: a large number of synthetic examples of 2D shapes with maximum variability; can be used for training/testing 2D shape recognition/retrieval.
SHAPES_2D_365k.zip: 365,000 2D shapes extracted from real-world images, saved as black-and-white .png image files.
All jpg images that are in the exact same subfolder contain the exact same shape (but with different texture/color/background/orientation).
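The subfolder convention above can be turned into matching pairs with a few lines of code. The following is a minimal sketch, not the dataset authors' loader: the function names are hypothetical, and it assumes only that the archive has been extracted to a local directory and that images use a lowercase .jpg extension.

```python
from collections import defaultdict
from pathlib import Path


def group_images_by_subfolder(root):
    """Map each subfolder to the sorted list of .jpg files directly inside it.

    Assumption (from the description above): images that share a subfolder
    show the same shape instance.
    """
    groups = defaultdict(list)
    for image_path in Path(root).rglob("*.jpg"):
        groups[image_path.parent].append(image_path)
    return {folder: sorted(paths) for folder, paths in groups.items()}


def positive_pairs(groups):
    """Yield every pair of images that share a subfolder (same instance)."""
    for paths in groups.values():
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                yield paths[i], paths[j]
```

Pairs drawn from two different subfolders can serve as negatives in the same way.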
For textures and materials, the goal is to identify and match images containing the same material or texture, while the shape/object on which the material or texture is applied differs, as do the background and lighting.
This is done for physics-based materials in 3D and abstract textures in 2D.
3D_Materials_PBR_Synthetic_GENERAL_LARGE_SET_80K.zip: a large number of examples of physically grounded 3D materials; can be used for training or testing material recognition/retrieval.
2D_Textures_Recogition_GENERAL_LARGE_SET_Synthetic_53K.zip: a large number of images of 2D textures with maximum variability of setting; can be used for training/testing 2D texture recognition/retrieval.
All jpg images that are in the exact same subfolder contain the exact same texture/material (but overlaid on different objects, with different backgrounds, illumination, and orientation).
The images in the synthetic part of the dataset were created by automatically extracting shapes and textures from natural images and combining them into synthetic images. The resulting synthetic images rely completely on real-world patterns, yielding extremely diverse and complex shapes and textures. As far as we know, this is the largest and most diverse shape and texture recognition/retrieval dataset. The 3D data was generated using physics-based materials and rendering (Blender), making the images physically grounded and enabling use of the data to train for real-world examples. The scripts for generating the data are supplied in files with the word SCRIPTS* in their names.
For 3D shape recognition and retrieval, we also supply a real-world natural image benchmark, with a variety of natural images containing the exact same 3D shape but made of or coated with different materials and in different environments and orientations. The goal is again to identify the same shape in different images. The benchmark is available in Real_Images_3D_shape_matching_Benchmarks.zip.
Files containing 'GENERAL_LARGE_SET' in their name contain synthetic images that can be used for training or testing; the type of data (2D shapes, 3D shapes, 2D textures, 3D materials) appears in the file name, as does the number of images. Files containing 'MultiTests' contain a number of different tests in which only a single aspect of the instance is changed (for example, only the background). Files containing 'SCRIPTS' contain data generation and testing scripts. Files containing 'examples' hold examples of each test.
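The naming convention described above can be read programmatically. The sketch below is a heuristic based only on the archive names listed in this description; the function name and the returned keys are hypothetical, and the count is approximate because the names give it in thousands.

```python
import re


def describe_archive(name):
    """Extract what the file name reveals about a LAS&T archive (heuristic)."""
    # "2D" or "3D" as a separate underscore-delimited token.
    dimension = re.search(r"(?:^|_)([23]D)(?:_|\.)", name)
    # Trailing count such as "_76k.zip" or "_80K.zip" (in thousands).
    count = re.search(r"_(\d+)[kK]\.zip$", name)
    return {
        "dimension": dimension.group(1) if dimension else None,
        "general_large_set": "GENERAL_LARGE_SET" in name,
        "multi_tests": "MultiTests" in name,
        "scripts": "SCRIPT" in name,
        "approx_count": int(count.group(1)) * 1000 if count else None,
    }
```

For example, `describe_archive("SHAPES_2D_365k.zip")` reports dimension "2D" and an approximate count of 365000.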
The file SHAPES_COLLECTION_350k.zip contains 350,000 2D shapes extracted from natural images and used for the dataset generation.
For evaluating and testing see: SCRIPTS_Testing_LVLM_ON_LAST_VQA.zip
These scripts can be used to test leading LVLMs through an API, create human tests, and in general turn the dataset into multiple-choice question images similar to those in the paper.
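As an illustration of the multiple-choice idea, one question can be assembled from images grouped by instance. This is a sketch under stated assumptions, not the supplied testing scripts: the function name, the dictionary layout, and the default of four choices are all hypothetical.

```python
import random


def make_multichoice_question(groups, num_choices=4, rng=None):
    """Build one question: a query image, shuffled choices, and the answer index.

    `groups` maps an instance id (for example a subfolder) to its images.
    Exactly one choice shares the query's instance; the rest are distractors.
    """
    rng = rng or random.Random()
    eligible = [key for key, items in groups.items() if len(items) >= 2]
    if not eligible:
        raise ValueError("need at least one instance with two or more images")
    target = rng.choice(eligible)
    query, positive = rng.sample(groups[target], 2)
    others = [key for key, items in groups.items() if key != target and items]
    if len(others) < num_choices - 1:
        raise ValueError("not enough other instances for distractors")
    distractors = [rng.choice(groups[key]) for key in rng.sample(others, num_choices - 1)]
    choices = distractors + [positive]
    rng.shuffle(choices)
    return {"query": query, "choices": choices, "answer": choices.index(positive)}
```

Passing a seeded `random.Random` makes the generated questions reproducible.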
Accepted by NeurIPS 2024 Datasets and Benchmarks Track
We introduce the RePAIR puzzle-solving dataset, a large-scale real-world dataset of fractured frescoes from the archaeological campus of Pompeii. Our dataset consists of over 1000 fractured frescoes. RePAIR stands as a realistic computational challenge for 2D and 3D puzzle-solving methods, and serves as a benchmark that enables the study of fractured object reassembly and presents new challenges for geometric shape understanding. Please visit our website for more dataset information, access to source code scripts, and an interactive gallery of dataset samples.
Access the entire dataset
We provide a compressed version of our dataset in two separate files: one for the 2D version and one for the 3D version.
Our full dataset contains over one thousand individual fractured fragments, divided into groups, each with its corresponding folder, and compressed into separate 2D and 3D subsets. In the 2D dataset, each fragment is saved as a .PNG image, and each group has the corresponding ground-truth transformation that solves the puzzle as a .TXT file. In the 3D dataset, each fragment is saved as a mesh in the widely used .OBJ format, with the corresponding material (.MTL) and texture (.PNG) files. The meshes are already in the assembled position and orientation, so no additional information is needed. All additional metadata is given as .JSON files.
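Because the 3D fragments are stored as .OBJ meshes, their vertex coordinates can be read with the standard library alone. The sketch below is not part of the RePAIR tooling: the function name is hypothetical, and it assumes only the standard OBJ convention that geometric vertices appear on lines starting with "v".

```python
def read_obj_vertices(path):
    """Return the (x, y, z) vertices of an OBJ mesh as a list of float tuples.

    Only geometric vertex lines ("v x y z") are read; texture coordinates,
    normals, faces, and material references are skipped.
    """
    vertices = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            parts = line.split()
            if len(parts) >= 4 and parts[0] == "v":
                vertices.append(tuple(float(value) for value in parts[1:4]))
    return vertices
```

Since the meshes are described as already being in their assembled pose, vertices read this way from the fragments of one group should share a common coordinate frame.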
Important Note
Please be advised that downloading and reusing this dataset is permitted only upon acceptance of the following license terms.
The Istituto Italiano di Tecnologia (IIT) declares, and the user (“User”) acknowledges, that the "RePAIR puzzle-solving dataset" contains 3D scans, texture maps, rendered images and meta-data of fresco fragments acquired at the Archaeological Site of Pompeii. IIT is authorised to publish the RePAIR puzzle-solving dataset herein only for scientific and cultural purposes and in connection with an academic publication referenced as Tsemelis et al., "Re-assembling the past: The RePAIR dataset and benchmark for real world 2D and 3D puzzle solving", NeurIPS 2024. Use of the RePAIR puzzle-solving dataset by User is limited to downloading, viewing such images; comparing these with data or content in other datasets. User is not authorised to use, in particular explicitly excluding any commercial use nor in conjunction with the promotion of a commercial enterprise and/or its product(s) or service(s), reproduce, copy, distribute the RePAIR puzzle-solving dataset. User will not use the RePAIR puzzle-solving dataset in any way prohibited by applicable laws. RePAIR puzzle-solving dataset therein is being provided to User without warranty of any kind, either expressed or implied. User will be solely responsible for their use of such RePAIR puzzle-solving dataset. In no event shall IIT be liable for any damages arising from such use.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is built for reconstructing 3D CAD models from 2D drawings and is based on the public Fusion 360 Gallery dataset. It includes two parts: the original data and the reconstructed data (in the folders '/original_data' and '/reconstructed'). The original data part contains the '.svg' files of the 2D drawings and the '.step' files of the CAD models. Our reconstruction results are in the second part (folder '/reconstructed'), which includes the reconstructed 3D wireframes, the 3D shapes with faces (stored as FreeCAD '.FCStd' files), and images (screenshots) of the reconstructed models. We also tested some cases from the ABC dataset, shown in the 2_ABC folder.
Please cite our paper if you use the dataset.
@article{zhang2023automatic,
  title={Automatic 3D CAD models reconstruction from 2D orthographic drawings},
  author={Zhang, Chao and Pinqui{\'e}, Romain and Polette, Arnaud and Carasi, Gregorio and De Charnace, Henri and Pernot, Jean-Philippe},
  journal={Computers \& Graphics},
  year={2023},
  publisher={Elsevier}
}
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Geoscience Australia is releasing its 2014 version of the Marine Seismic Surveys Shape and KML files. These files have been updated to include recent open-file surveys. The spatial files have been created from a cleansed, updated collection of P190 navigation files. This navigation collection has grown from checking the navigation submitted to the GA Repository under the Offshore Petroleum and Greenhouse Gas Storage Regulations, checking the 2003 SNIP navigation files, and digitising old survey track maps as required. Soon the individual P190 files will be available for download through the new NOPIMS delivery system. The collection is based on P190 navigation files, which follow the UKOOA standard. Extensive industry-standard metadata associated with each seismic survey is preserved in the attribute tables of these datasets.
The shapefiles have been categorised into 3D exploration, 2D exploration and 2D investigative seismic files. All marine surveys undertaken by Geoscience Australia for exploration or investigative purposes have been included in the collection. Geoscience Australia (email - AusGeodata@ga.gov.au) appreciates being notified of any errors found in the navigation collection.
The data is available in both KML and Shape file formats.
The KML file can be viewed using a range of applications including Google Earth, NASA WorldWind, ESRI ArcGIS Explorer, Adobe PhotoShop, AutoCAD3D or any other earth browser (geobrowser) that accepts KML formatted data.
Alternatively the Shape files can be downloaded and viewed using any application that supports shape files.
Disclaimer: Geoscience Australia gives no warranty regarding the data downloads provided herein nor the data's accuracy, completeness, currency or suitability for any particular purpose. Geoscience Australia disclaims all other liability for all loss, damages, expense and costs incurred by any person as a result of relying on the information in the data downloads.
You can also purchase hard copies of Geoscience Australia data and other products at http://www.ga.gov.au/products-services/how-to-order-products/sales-centre.html
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The following are made available:
-the Fortran source code of the model initialization module modified for this simulation work,
"module_initialize_Titan_lakebreeze3d_xy_shoreline.F"
-the input files used to run the simulations,
in "inputs/"
-a list of the simulations done and the netCDF outputs,
"simus_done_for_paper3D.pdf"
"run-##_y0_tsol4.nc.gz" --> slices at a given y (at the center), 4th tsol [for 2D and 3D simulations]
"run-##_y0_tsol3.nc.gz" --> slices at a given y (at the center), 3rd tsol [for 2D and 3D simulations]
"run-##_z0_tsol4.nc.gz" --> slices at a given z (at the surface), 4th tsol [only for 3D simulations]
"run-##_z200_tsol4.nc.gz" --> slices at a given z (at ~200 m), 4th tsol [only for 3D simulations]
"run-##_tsol4_2am.nc.gz" --> total simulation output at a given time (2am on 4th tsol) [only for 3D simulations]
"run-##_tsol4_2pm.nc.gz" --> total simulation output at a given time (2pm on 4th tsol) [only for 3D simulations]
-the Python codes to plot figures from the netCDF output files,
in "postprocessing_python/"
"mtwrf_analysis_1D_t.py" --> plot variables with time at given (x,y,z) [for 2D and 3D simulations]
"mtwrf_analysis_2D_xt.py" --> plot variables with (x,t) at given (y,z) [for 2D and 3D simulations]
"mtwrf_analysis_2D_xz.py" --> plot variables with (x,z) at given (y,t) [for 2D and 3D simulations]
"mtwrf_analysis_2D_xy.py" --> plot variables with (x,y) at given (z,t) [only for 3D simulations]
-- same as the previous ones but to plot along a different x-axis (rotated from the one of the netCDF files) [only for 3D simulations]
"mtwrf_analysis_1D_t_diagonal.py"
"mtwrf_analysis_2D_xt_diagonal.py"
"mtwrf_analysis_2D_xz_diagonal.py"
-the Matlab variables and figures from the analysis of the simulated lake breeze dimensions
in "postprocessing_matlab/"
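The netCDF outputs listed above are gzip-compressed (".nc.gz"), so they need to be decompressed before a netCDF reader can open them. A minimal sketch using only the Python standard library follows; the function name is hypothetical, and a file name such as "run-01_y0_tsol4.nc.gz" is only an illustration of the "run-##" pattern.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dst=None):
    """Decompress a .gz file and return the path of the decompressed copy.

    By default the output sits next to the input with the final ".gz"
    suffix removed, so "name.nc.gz" becomes "name.nc".
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as compressed, open(dst, "wb") as output:
        shutil.copyfileobj(compressed, output)
    return dst
```

The resulting .nc file can then be opened with any netCDF library, or passed to the plotting scripts in "postprocessing_python/" if they expect uncompressed input (an assumption; the listing does not say).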
This data set is one component of a digital terrain model (DTM) for the SWFWMD Polk District. This record includes information about the LiDAR data for the following SWFWMD tracts: Hampton, Judy, Lake Wales, Peace River (North) and Polk Remainder. All of these tracts are located in Polk County. Please see the Bounding Coordinates for each tract for the location within Polk County. Information that is specific to each tract has been maintained.

HAMPTON TRACT
This data set is one component of a digital terrain model (DTM) for Hampton Tract, Polk County, Florida encompassing approximately 43 square miles. This dataset is comprised of 48 LiDAR files, based on the DISTRICT 5,000' by 5,000' sheet index system (17951-17958, 18114-18121, 18363-18370, 18526-18533, 18259-18266 and 18594-18601) in the LAS file format. The raw data was collected at an average ground sample distance of 1-meter. Other components of the DTM include: 3-D breaklines along hydrographic features in the Shape file format; lake/pond polygons (in 3D) in the shape file format; obscured area polygons (in 2D) in the Shape file format; and hard/soft breaklines (in 3D) in the Shape file format.
Date of Collection: 20060125
Bounding Coordinates of tract:
West Bounding Coordinate: -82.035984
East Bounding Coordinate: -81.903345
North Bounding Coordinate: 28.328847
South Bounding Coordinate: 28.238391

JUDY TRACT
This data set is one component of a digital terrain model (DTM) for Judy Tract, Polk County, Florida encompassing approximately 12.6 square miles. This dataset is comprised of 14 LiDAR files, based on the DISTRICT 5,000' by 5,000' sheet index system (17632-17636, 17795-17799, and 17959-17962) in the LAS file format. The raw data was collected at an average ground sample distance of 1-meter. Other components of the DTM include a personal geodatabase containing: obscured vegetation polygons; road overpass polygons; road breaklines; soft feature breaklines; water body polygons; coastal shorelines; 1-foot contours; hydrographic feature breaklines, and island polygons in accordance with the SWFWMD 2006 Topographic Database Design.
Date of Collection: 20060125
Bounding Coordinates of tract:
West Bounding Coordinate: -81.925929
East Bounding Coordinate: -81.848175
North Bounding Coordinate: 28.350110
South Bounding Coordinate: 28.308789

LAKE WALES
This data set is one component of a digital terrain model (DTM) for Lake Wales, Polk County, Florida encompassing approximately 10.75 square miles. This dataset is comprised of 12 LiDAR files, based on the DISTRICT 5,000' by 5,000' sheet index system (22365-22367, 22528-22530, 22691-22693 and 22854-22856) in the LAS file format. The raw data was collected at an average ground sample distance of 1-meter. Other components of the DTM include: 3-D breaklines along hydrographic features in the Shape file format; lake/pond polygons (in 3D) in the shape file format; obscured area polygons (in 2D) in the Shape file format; and hard/soft breaklines (in 3D) in the Shape file format.
Date of Collection: 20060125
Bounding Coordinates of tract:
West Bounding Coordinate: -81.543294
East Bounding Coordinate: -81.509932
North Bounding Coordinate: 27.916603
South Bounding Coordinate: 27.868213

PEACE RIVER (NORTH)
This data set is one component of a digital terrain model (DTM) for Peace River North (P692), Polk County, Florida encompassing approximately 1,149 square miles. This dataset is comprised of 1,281 LiDAR files, based on the DISTRICT 5,000' by 5,000' sheet index system in the LAS file format. The raw data was collected at an average ground sample distance of 1-meter. Other components of the DTM include: 3-D breaklines along hydrographic features in the Shape file format; lake/pond polygons (in 3D) in the shape file format; obscured area polygons (in 2D) in the Shape file format; and hard/soft breaklines (in 3D) in the Shape file format.
Date of Collection: 20060227
Bounding Coordinates of tract:
West Bounding Coordinate: -82.035984
East Bounding Coordinate: -81.903345
North Bounding Coordinate: 28.328847
South Bounding Coordinate: 28.238391

POLK DISTRICT REMAINDER
This dataset is one component of a digital terrain model (DTM) for the Southwest Florida Water Management District's FY2007 Remainder Polk District LiDAR Mapping Project and Polk District Contours Project (L672) encompassing approximately 428 square miles in Polk County, Florida. This dataset is comprised of 478 LiDAR files, based on the FL Statewide 5,000' by 5,000' sheet index system in the LAS version 1.1 file format. LiDAR acquisition dates were January 27, January 30 through February 20, 2005. The raw data was collected at an average ground sample distance of 2.1 feet. Other components of the DTM include a personal geodatabase in accordance with the SWFWMD 2006 Topographic Database Design containing: obscured vegetation polygons; road overpass polygons; road breaklines; soft feature breaklines; water body polygons; coastal shorelines; hydrographic features breaklines; island polygons; and 1-foot contours. Final products include FEMA-compliant LIDAR-derived DTM data and 1-foot contours (for cartographic visualization purposes only) meeting or exceeding National Map Accuracy Standards for 2-foot contours.
This area is not a tract, but an addition of areas to the Polk District data set. This data set consists of three areas within Polk County. The approximate bounding coordinates for each area are given below. Area 3 borders the east and north sides of the Peace River (North) and Judy Tract data sets. The bounding coordinates for Area 3 are generalized here, but it is actually a multi-sided polygon with many vertices.
Date of Collection: 20050127, 20050130-20050220
Bounding Coordinates | Area 1 | Area 2 | Area 3 |
West Bounding Coordinate | -82.104809 | -81.958339 | -81.934587 |
East Bounding Coordinate | -82.033554 | -81.918752 | -81.380374 |
North Bounding Coordinate | 28.314262 | 28.349889 | 28.397394 |
South Bounding Coordinate | 28.171749 | 28.322179 | 27.625454 |
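The bounding coordinates in this record can be used for a quick check of whether a point of interest falls inside a tract's extent. The sketch below is illustrative only: the class and constant names are hypothetical, the coordinate values are copied from the Hampton and Lake Wales entries of this record, and a bounding box is only a coarse approximation of the actual tract outline.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Geographic extent in decimal degrees (longitude west/east, latitude south/north)."""
    west: float
    east: float
    south: float
    north: float

    def contains(self, lon, lat):
        """Return True if the point lies inside or on the edge of the box."""
        return self.west <= lon <= self.east and self.south <= lat <= self.north


# Values copied from the record above.
HAMPTON_TRACT = BoundingBox(west=-82.035984, east=-81.903345, south=28.238391, north=28.328847)
LAKE_WALES = BoundingBox(west=-81.543294, east=-81.509932, south=27.868213, north=27.916603)
```

A point that passes this test may still lie outside the tract itself, so the check is best treated as a first filter before a polygon-based test.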
This data set is one component of a digital terrain model (DTM) for Weeki Wachee, Hernando County, Florida encompassing approximately 13.5 square miles. This dataset is comprised of 15 LiDAR files, based on the DISTRICT 5,000' by 5,000' sheet index system, in the LAS file format. The raw data was collected at an average ground sample distance of 1-meter. Other components of the DTM include: 3-D breaklines along hydrographic features in the Shape file format; lake/pond polygons (in 3D) in the shape file format; obscured area polygons (in 2D) in the Shape file format; and hard/soft breaklines (in 3D) in the Shape file format.
Metadata Portal Metadata Information
Content Title | |
Content Type | Hosted Feature Layer, Web Map, Web Application, Aerial Imagery, Basemap, Table, Scene Layer/Scene Layer Package, Datastore, 2D Data, 3D Data, Other, Other Document |
Description | |
Initial Publication Date | DD/MM/YYYY |
Data Currency | DD/MM/YYYY |
Data Update Frequency | Daily, Weekly, Fortnightly, Monthly, Quarterly, Half-Yearly, Yearly, Other, API |
Content Source | Website URL, API, Data provider files, Other |
File Type | CSV (.csv), EPS, ESRI File Geodatabase (.gdb), ESRI Shapefile (.shp), Excel (.xlsx), Geography Markup Language (.gml), GeoPDF, GPS Exchange Format (.gpx), GeoJSON, Industry Foundation Classes (IFC), JSON, Keyhole Markup Language (.kml), Keyhole Markup Language Zip (.kmz), MapInfo (.tab), Scene Layer Package (.slpk), TIFF, Web Feature Service, Well Known Text (*.wkt), Document, Imagery Layer, Map Feature Service, Document Link |
Attribution | |
Data Theme, Classification or Relationship to other Datasets | |
Accuracy | |
Spatial Reference System (dataset) | GDA94, GDA2020, WGS84, Other |
Spatial Reference System (web service) | EPSG:4326, EPSG:3857, EPSG:900913, Other |
WGS84 Equivalent To | GDA94, GDA2020, Other |
Spatial Extent | |
Content Lineage | |
Data Classification | Business Impact Levels (BIL), Commercial, Confidential, For Office Use Only, NSW:Sensitive Law Enforcement, Protected, Secret, Sensitive:Cabinet, Sensitive:Health Information, Sensitive:Legal, Sensitive:Personal, Sensitive:NSW Cabinet, Sensitive:NSW Government, Top Secret, Unclassified |
Data Access Policy | Open, Shared, Restricted, Withdrawn from Service |
Data Quality | |
Terms and Conditions | Creative Commons, Data Sharing Agreement, Memorandum of Understanding, Restricted Licence, Standard Licence |
Standard and Specification | |
Data Custodian | |
Point of Contact | |
Data Aggregator | |
Data Distributor | |
Additional Supporting Information | |
TRIM Number | |
Metadata Portal Metadata Information
Content Title | |
Content Type | Hosted Feature Layer, Web Map, Web Application, Aerial Imagery, Basemap, Table, Scene Layer/Scene Layer Package, Datastore, 2D Data, 3D Data, Other, Other Document |
Description | |
Initial Publication Date | DD/MM/YYYY |
Data Currency | DD/MM/YYYY |
Data Update Frequency | Daily, Weekly, Fortnightly, Monthly, Quarterly, Half-Yearly, Yearly, Other, API |
Content Source | Website URL, API, Data provider files, Other |
File Type | CSV (.csv), EPS, ESRI File Geodatabase (.gdb), ESRI Shapefile (.shp), Excel (.xlsx), Geography Markup Language (.gml), GeoPDF, GPS Exchange Format (.gpx), GeoJSON, Industry Foundation Classes (IFC), JSON, Keyhole Markup Language (.kml), Keyhole Markup Language Zip (.kmz), MapInfo (.tab), Scene Layer Package (.slpk), TIFF, Web Feature Service, Well Known Text (*.wkt), Document, Imagery Layer, Map Feature Service, Document Link |
Attribution | |
Data Theme, Classification or Relationship to other Datasets | |
Accuracy | |
Spatial Reference System (dataset) | GDA94, GDA2020, WGS84, Other |
Spatial Reference System (web service) | EPSG:4326, EPSG:3857, EPSG:7844, EPSG:900913, Other |
WGS84 Equivalent To | GDA94, GDA2020, Other |
Spatial Extent | |
Content Lineage | |
Data Classification | Business Impact Levels (BIL), Commercial, Confidential, For Office Use Only, NSW:Sensitive Law Enforcement, Protected, Secret, Sensitive:Cabinet, Sensitive:Health Information, Sensitive:Legal, Sensitive:Personal, Sensitive:NSW Cabinet, Sensitive:NSW Government, Top Secret, Unclassified |
Data Access Policy | Open, Shared, Restricted, Withdrawn from Service |
Data Quality | |
Terms and Conditions | Creative Commons, Data Sharing Agreement, Memorandum of Understanding, Restricted Licence, Standard Licence |
Standard and Specification | |
Data Custodian | |
Point of Contact | |
Data Aggregator | |
Data Distributor | |
Additional Supporting Information | |
TRIM Number | |