City of Corona
Data Glossary
A
Access Control | The process of granting or denying specific requests for obtaining and using information and related information processing services. |
Application | Software that communicates with other software, connects to a database, or runs as a client on a desktop or mobile operating system. Applications are a way of consuming and utilizing data. |
API (Application Programming Interface) | A computing interface that defines interactions between multiple software or mixed hardware-software intermediaries. In simple terms, APIs can be used to connect to data and extract it. |
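To illustrate, consuming an API usually means requesting data over HTTP and decoding the response, often JSON. The sketch below skips the network call and decodes a canned payload standing in for an API response body; the field names (`datasets`, `id`, `rows`) are hypothetical, not any particular API's schema.

```python
import json

# Hypothetical JSON payload, standing in for the body an API might return.
response_body = '{"datasets": [{"id": "crime-2023", "rows": 1200}, {"id": "permits", "rows": 450}]}'

def extract_dataset_ids(body):
    """Decode a JSON API response and pull out the dataset identifiers."""
    payload = json.loads(body)
    return [item["id"] for item in payload["datasets"]]

ids = extract_dataset_ids(response_body)
print(ids)  # the two dataset ids from the sample payload
```

In a real integration, `response_body` would come from an HTTP request to the API's endpoint, but the decoding step is the same.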
Attribute | A characteristic of a geographic or spatial feature, typically stored in tabular format and linked to the feature in a relational database. The attributes of a well-represented point might include an identification number, address, and type. |
B
Base Layer | A primary layer for spatial reference, upon which other layers are built. Typical examples include parcels and street centerlines. |
Big Data | High-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation. |
Buffer | A zone of a specified distance around a feature. |
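As a minimal sketch of the buffer idea (for a point feature, where the buffer is simply a circle), the test below checks whether another point falls within the specified distance. The coordinates and the hydrant example are illustrative, not real data.

```python
import math

def within_buffer(feature, point, distance):
    """Return True if `point` falls inside a circular buffer of the given
    distance around `feature` (both are (x, y) tuples in map units)."""
    return math.dist(feature, point) <= distance

hydrant = (100.0, 200.0)
print(within_buffer(hydrant, (103.0, 204.0), 10.0))  # inside the 10-unit buffer
print(within_buffer(hydrant, (150.0, 200.0), 10.0))  # outside it
```

Buffers around lines and polygons require true geometric offsetting, which GIS software handles; the point case shows the underlying distance test.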
BI (Business Intelligence) | Business intelligence (BI) refers to capabilities that enable organizations to make better decisions, take informed actions, and implement more-efficient business processes. |
BI (Business Intelligence) Platforms | Software platforms that enable enterprises to build BI applications by providing capabilities in three categories: analysis, such as online analytical processing (OLAP); information delivery, such as reports and dashboards; and platform integration, such as BI metadata management and a development environment. |
C
Catalog | A catalog is a collection of datasets, maps and other visuals. |
Category | Methodology by which items or datasets are classified or grouped under a similar theme or topic. Also referred to as Taxonomy. |
Change Control | The processes, authorities for, and procedures to be used for all changes that are made to the computerized system and/or the system's data; a vital subset of the Quality Assurance program and should be clearly described in the establishment's SOPs. |
Collection | The acquisition of information and the provision of this information to processing elements. |
Comma Separated Values File (CSV) | A standard format for spreadsheets where data is stored in a plain text file, with each data row on a new line and commas separating the values on each row. |
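For example, Python's standard `csv` module reads each CSV row into a dictionary keyed by the header line. The in-memory string below stands in for a file on disk, and the names and figures in it are placeholders.

```python
import csv
import io

# A small CSV payload; a real file would be opened with open("data.csv", newline="").
raw = "name,population\nCorona,157000\nNorco,26000\n"

with io.StringIO(raw) as f:
    rows = list(csv.DictReader(f))  # each row becomes a dict keyed by the header

print(rows[0]["name"], rows[0]["population"])
```

Note that CSV carries no type information: the population values come back as strings and must be converted explicitly if numeric analysis is needed.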
Content Management (CM) | A broad term referring to applications and processes to manage Web content, document content and e-commerce-focused content. |
Computer Aided Design (CAD) | An automated system for the design, drafting and display of graphically oriented information. |
Coordinate | An x,y location in a Cartesian coordinate system, or an x,y,z coordinate in a three-dimensional system. Coordinates represent locations on the Earth’s surface relative to other locations. |
D
Dashboard | A centralized, interactive, and visual display of data used to monitor conditions or facilitate understanding. |
Data | Data can be defined broadly as information collected on a specific subject. Data can be divided further into structured data (organized in rows and columns) and unstructured data (for example, images). Data includes numbers or information in tables, graphs, maps, images, and other forms. Data can be used to understand spatial and temporal trends of a phenomenon, to identify factors associated with the phenomenon, and to predict its future behavior. |
Data Driven | An approach of making strategic decisions based on data analysis and interpretation. |
Database | A logical collection of interrelated information, managed and stored as a unit. A GIS database includes data about the spatial location and shape of geographic features recorded as points, lines, and polygons as well as their attributes. |
Dataset | A dataset is any organized collection of data. Dataset is a flexible term and may refer to an entire database, a spreadsheet or other data file, or a related collection of data resources. |
Data Administrator | One who manages access, security, and integrity of the database and monitors the performance of the database system to maintain any established service level agreements. |
Data Analytics | Interpretation of information in context, typically through use of statistical measures, data models, reports, and dashboards. |
Data Architecture | Architectural framework for how data is stored, managed, and used in a system; describes how data is persistently stored, how components and processes reference and manipulate this data, and how external/legacy systems access the data. |
Data Archiving | The set of practices around the storage and monitoring of the state of digital material over the years. |
Data Asset | Any entity that is comprised of data; may be a system or application output file, database, document, or web page; also includes a service that may be provided to access data from an application. |
Data Breach | The loss, theft, or other unauthorized access to data containing sensitive personal information, in electronic or printed form, that results in the potential compromise of the confidentiality or integrity of the data. |
Data Catalog | An organized inventory of data assets in the organization. It uses metadata to help organizations manage their data. It also helps data professionals collect, organize, access, and enrich metadata to support data discovery and governance. |
Data Classification | The assignment of a level of sensitivity to data that results in the specification of controls for each level of classification. Levels are assigned according to predefined categories as data are created, amended, enhanced, stored or transmitted. |
Data Cleaning or Scrubbing | The process of correcting or removing errors from data in its raw form before analysis. Data scrubbing can take as much as 80% of an analyst’s time in the entire analytical process. |
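A minimal sketch of common scrubbing steps: trimming stray whitespace, normalizing letter case, and dropping records with missing values. The field names and records are hypothetical.

```python
# Raw records with inconsistent whitespace, casing, and a missing value.
raw_records = [
    {"name": "  Main St Park ", "type": "PARK"},
    {"name": "City Hall", "type": None},        # missing value -> dropped
    {"name": "Civic Center", "type": " office"},
]

def clean(records):
    """Drop incomplete records, then trim and lowercase every value."""
    cleaned = []
    for rec in records:
        if any(v is None for v in rec.values()):
            continue  # discard records with missing fields
        cleaned.append({k: v.strip().lower() for k, v in rec.items()})
    return cleaned

print(clean(raw_records))
```

Real-world scrubbing adds many more rules (deduplication, type coercion, standardized codes), but they follow this same pattern of record-by-record checks and transformations.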
Data Corruption | A violation of data integrity. |
Data Custodian | An IT professional working with data systems to ensure data is stored, transported, and accessed appropriately. |
Data Dictionary | A collection of information about data such as name, description, creator, owner, provenance, translation in different languages, and usage. |
Data Element | A basic unit of information that has a unique meaning and subcategories (data items) of distinct value. Examples of data elements include gender, race, and geographic location. |
Data Element Standardization | The process of documenting, reviewing, and approving unique names, definitions, characteristics, and representations of data elements according to established procedures and conventions. |
Data Flow | The flow of data from the input to output. Data flow includes travel through the communication lines, routers, switches, and firewalls as well as processing through various applications on servers. |
Data Governance | A set of processes that ensures that data assets are formally managed throughout the enterprise. A data governance model establishes authority and management and decision-making parameters related to the data produced or managed by the enterprise. |
Data Integration | The process of retrieving data from multiple source systems and combining it in such a way that it can yield consistent, comprehensive, current, and correct information for business reporting and analysis. |
Data Integrity | A property whereby data has not been altered in an unauthorized manner since it was created, transmitted, or stored. |
Data Interoperability | Interoperability concerning the creation, meaning, computation, use, transfer, and exchange of data. |
Data Item | A named component of a data element; usually the smallest element. |
Data Lake | A concept consisting of a collection of storage instances of various data assets. These assets are stored in a near-exact, or even exact, copy of the source format and are in addition to the originating data stores. |
Data Life Cycle | The sequence of stages that a particular unit of data goes through from its initial generation or capture to its eventual archival and/or deletion at the end of its useful life. |
Data Lineage | Data lineage is the journey data takes from its creation through its transformations over time. It describes a certain dataset’s origin, movement, characteristics and quality. |
Data Literacy | The ability to read, write, and communicate data in context, with an understanding of the data sources and constructs, analytical methods and techniques applied, and the ability to describe the use case application and resulting business value or outcome. |
Data Loss Protection | A set of technologies and inspection techniques used to classify information content contained within an object — such as a file, email, packet, application or data store — while at rest (in storage), in use (during an operation) or in transit (across a network). |
Data Management | The practice of putting into place policies, procedures, and best practices to ensure that data is understandable, trusted, visible, accessible and interoperable. |
Data Mapping | A method used to identify and link selected data to one or more equivalent standard data elements. |
Data Mining | An analytical process that attempts to find correlations or patterns in large data sets for the purpose of data or knowledge discovery. |
Data Modeling | A formal graphical and textual representation of the entities and relationships involved in a data process; provides a mechanism for understanding the intended activity of a new system and designing the data. |
DataOps | The hub for collecting and distributing data, with a mandate to provide controlled access to systems of record for customer and marketing performance data, while protecting privacy, usage restrictions, and data integrity. |
Data Owner | The individual(s), normally a manager or director, who has responsibility for the integrity, accurate reporting and use of computerized data. |
Data Portal | A platform where data and maps from valid sources are published and shared with the public free of cost, and are regularly maintained and refreshed to keep them current and relevant for users. |
Data Protection | The implementation of appropriate administrative, technical, or physical means to guard against unauthorized intentional or accidental disclosure, modification, or destruction of data. |
Data Quality | The planning, implementation, and control of activities that apply quality management techniques to data, in order to assure it is fit for consumption and meets the needs of data consumers. |
Data Schema | A specification that defines the structure of the data (required data elements and types, and supporting definitions). |
Data Security | Physical, technical, and administrative measures used to safeguard protected information from unauthorized access, modification, use, disclosure, or destruction. |
Data Set | A collection of related records. |
Data Standards | The rules by which data are described and recorded. In order to share, exchange, and understand data, we must standardize the format as well as the meaning. |
Data Steward | One who oversees and maintains consistent reference data and master data definitions, publishes relevant interpretation and proper usage of the data, and ensures the quality of the content and metadata. |
Data Stewardship | The most common label to describe accountability and responsibility for data and processes that ensure effective control and use of data assets. |
Data Strategy | A highly dynamic process employed to support the acquisition, organization, analysis, and delivery of data in support of business objectives. |
Data Structure | The relationships among files in a database and among data items within each file. |
Data Table | Data table refers to a tabular form of data that can be displayed in rows and columns. |
Data Users | Any individual or organization that accesses, downloads, analyzes, or who uses data to develop apps, visualizations, reports, and other information products or services. |
Data Validation | A process used to determine if data are inaccurate, incomplete, or unreasonable; the checking of data for correctness or compliance with applicable standards, rules, and conventions. |
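A sketch of rule-based validation: each field gets a predicate, and a record fails if any predicate returns False. The field names (`zip`, `year`) and their rules are illustrative, not an established standard.

```python
# One validation rule per field; each rule is a predicate on the field's value.
rules = {
    "zip": lambda v: isinstance(v, str) and len(v) == 5 and v.isdigit(),
    "year": lambda v: isinstance(v, int) and 1900 <= v <= 2100,
}

def validate(record):
    """Return the list of field names that violate their rule."""
    return [field for field, check in rules.items() if not check(record.get(field))]

print(validate({"zip": "92882", "year": 2024}))  # no violations -> empty list
print(validate({"zip": "9288", "year": 1776}))   # both fields fail
```

Keeping the rules in a data structure, rather than scattered through code, makes the checks easy to document and extend as standards change.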
Data Visualization | A way to represent information graphically, highlighting patterns and trends in data and helping the reader to achieve quick insights. |
Data Warehouse | A data warehouse is a type of data management system that is designed to enable and support business intelligence (BI) activities, especially analytics. Data warehouses are solely intended to perform queries and analysis and often contain large amounts of historical data. The data within a data warehouse is usually derived from a wide range of sources such as application log files and transaction applications. |
Data Wiping | The process of logically removing data from a read/write medium so that it can no longer be read. Performed externally by physically connecting storage media to a hardware bulk-wiping device, or internally by booting a PC from a CD or network, it is a nondestructive process that enables the medium to be safely reused without loss of storage capacity or leakage of data. |
Descriptive Analytics | The examination of data, usually manually performed, to answer the question “What happened?” (or What is happening?), characterized by traditional business intelligence (BI) and visualizations such as pie charts, bar charts, line graphs, tables, etc. |
Diagnostic Analytics | A form of advanced analytics which examines data or content to answer the question “Why did it happen?” and is characterized by techniques such as drill-down, data discovery, data mining and correlations. |
Digital Transformation | Digital transformation can refer to anything from IT modernization (for example, cloud computing), to digital optimization, to the invention of new digital business models. The term is widely used in public-sector organizations to refer to modest initiatives such as putting services online or legacy modernization. Thus, the term is more like “digitization” than “digital business transformation.” |
Digitize | To encode map features as x,y coordinates in digital form. Lines are traced to define their shapes. This can be accomplished either manually or by use of a scanner. |
E
ETL (Extraction, Transformation, Loading) | A data pipeline used to collect data from various sources, transform the data according to business rules, and load it into a destination data store. |
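A toy version of that pipeline: extract rows from a source, transform them per a business rule (here, converting fee strings to numbers), and load them into a destination list standing in for a data store. All names and values are hypothetical.

```python
# Source rows as they might arrive from an upstream system (fees as strings).
source = [{"permit": "B-101", "fee": "125.00"}, {"permit": "B-102", "fee": "80.50"}]
destination = []

def run_etl(rows, dest):
    for row in rows:                              # extract: read each source row
        row = {**row, "fee": float(row["fee"])}   # transform: string -> number
        dest.append(row)                          # load: write to the destination
    return dest

run_etl(source, destination)
print(destination[0]["fee"] + destination[1]["fee"])  # fees are now numeric
```

Production ETL tools add scheduling, error handling, and incremental loads, but the extract/transform/load shape is the same.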
F
File Format | The internal arrangement (format) of a file, not how it is displayed to users. For example, CSV, XLS, and JSON files are structured very differently, but may look similar or identical when opened in a spreadsheet program. The format corresponds to the last part of the file name, or extension. |
Flat Files | A flat file is an informal term for a single table of data from which all word processing or other structure characters or markup have been removed. A flat file stores data in plain text format. Because of their simple structure, flat files can only be read, stored and sent. CSV files are one of the most common types of flat files. |
G
Geocode | The process of identifying a location by one or more attributes from a base layer. |
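In its simplest form this is a table lookup: normalize an input address and match it against a base layer of known addresses to obtain coordinates. The sketch below uses hypothetical addresses and illustrative coordinates; real geocoders add fuzzy matching and street-range interpolation.

```python
# A tiny base layer mapping normalized addresses to illustrative (lat, lon) pairs.
base_layer = {
    "400 S VICENTIA AVE": (33.8753, -117.5664),
    "123 MAIN ST": (33.8800, -117.5700),
}

def geocode(address):
    """Return (lat, lon) for a known address, or None if no match."""
    return base_layer.get(address.strip().upper())

print(geocode("400 s vicentia ave"))  # matches despite casing/whitespace
print(geocode("999 UNKNOWN RD"))      # no match -> None
```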
Geographic Information System (GIS) | An organized collection of computer hardware, software, geographic data, and personnel designed to efficiently capture, store, update, manipulate, analyze, and display all forms of geographically referenced information. |
Geospatial Data | Data related to the position of things in the real world, including boundaries or locations. |
I
Information Life Cycle | Information life cycle, as defined in OMB Circular A-130, means the stages through which information passes, typically characterized as creation or collection, processing, dissemination, use, storage, and disposition. |
Information Management | The function of managing an organization’s information resources for the handling of data and information acquired by one or many different systems, individuals, and organizations in a way that optimizes access by all who have a share in that data or a right to that information. |
Interoperability | The capability to communicate, execute programs, or transfer data among various functional units in a manner that requires the user to have little or no knowledge of the unique characteristics of those units. |
L
Layer | A logical set of thematic data described and stored in a map library. Layers act as digital transparencies that can be laid atop one another for viewing or spatial analysis. |
Line | Lines represent geographic features too narrow to be displayed as an area at a given scale, such as contours, street centerlines, or streams. |
M
Master Data | Data held by an organization that describes the entities that are both independent and fundamental for an enterprise, and that it needs to reference in order to perform its transactions. |
Metadata | Information describing the characteristics of data including, for example, structural metadata describing data structures (e.g., data format, syntax, and semantics) and descriptive metadata describing data contents (e.g., information security labels). |
Microsoft Azure | Microsoft Azure provides infrastructure as a service, platform as a service, and serverless computing environments, in addition to a multitude of cloud service offerings. |
O
Ortho Imagery (Ortho) | Aerial photographs that have been rectified to produce an accurate image of the Earth by removing tilt and relief displacements, which occurred when the photo was taken. |
Open Data | Public data that are made available consistent with relevant privacy, confidentiality, security, and other valid access, use, dissemination restrictions, and structured in a way that enables the data to be fully discoverable and usable by end users. |
P
PII (Personally Identifiable Information) | Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. |
Point | A single x,y coordinate that represents a geographic feature too small to be displayed as a line or area at that scale. |
Polygon | A multisided figure that represents area on a map. Polygons have attributes that describe the geographic feature they represent. |
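A common polygon operation is testing whether a point lies inside it. One standard approach (a sketch, not tied to any particular GIS product) is the ray-casting algorithm: cast a horizontal ray from the point and count how many polygon edges it crosses; an odd count means the point is inside.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test: count crossings of a horizontal ray from (x, y)
    against each polygon edge; an odd count means the point is inside."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]          # wrap around to close the polygon
        if (y1 > y) != (y2 > y):            # edge spans the ray's height
            # x-coordinate where the edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))   # inside
print(point_in_polygon(15, 5, square))  # outside
```

GIS software applies the same idea with spatial indexes so the test stays fast across millions of features.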
Power BI | Microsoft’s visualization and BI report platform. Part of the wider Power Platform line of tools for BI, automation, app development, and app connectivity. |
Predictive Analytics | A form of advanced analytics which examines data to answer the question “What is going to happen?” or more precisely, “What is likely to happen?”, and is characterized by techniques such as regression analysis, predictive modeling, and forecasting. |
Prescriptive Analytics | Advanced analytics which examines data or content to answer the question “What should be done?” or “What can we do to make _______ happen?”, and is characterized by techniques such as graph analysis, simulation, complex event processing. |
Python | An open-source interpreted high-level general-purpose programming language. In the context of data analytics, Python is an industry standard language for carrying out mathematical operations, data cleansing, data transformation, data visualization, data modeling, and data mining tasks thanks to a wide and well-supported ecosystem of libraries. |
Q
Query | A request for data or information from a database table or combination of tables. This data may be generated as results returned by Structured Query Language (SQL). |
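For example, Python's built-in `sqlite3` module can run a SQL query against an in-memory database, which stands in here for a production relational database; the table and values are made up for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for a production relational database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE parcels (apn TEXT, acreage REAL)")
con.executemany("INSERT INTO parcels VALUES (?, ?)",
                [("101-020-001", 0.25), ("101-020-002", 1.50)])

# A query: request only the parcels larger than one acre.
rows = con.execute("SELECT apn FROM parcels WHERE acreage > 1.0").fetchall()
print(rows)
con.close()
```

The same SELECT statement works against combinations of tables by adding JOIN clauses, which is how queries draw related data together.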
R
Record | A group of related data elements treated as a unit. [A data element (field) is a component of a record, a record is a component of a file (database)]. |
Records Management | The process of tagging information for recordkeeping requirements as mandated in the Federal Records Act and National Archives and Records Administration requirements. |
Relational Database | A database in which the data are organized according to a relational model. |
Relational Database Management System | A management system for relational databases. It organizes data according to the relational model, using structures with specific characteristics (tables or relations, unique keys, etc.). |
Report | A static document, table, or visualization that gathers data into one place and presents it visually. |
Risk Management | The process of managing risks to organizational operations (including mission, functions, image, or reputation), organizational assets, or individuals resulting from the operation of an information system. |
S
Scale | The ratio or relationship between a distance or area on a map and the corresponding distance or area on the ground. |
Shapefile | The shapefile format is a popular geospatial vector data format for geographic information system (GIS) software. The shapefile format can spatially describe vector features: points, lines, and polygons, representing, for example, water wells, rivers, and lakes. |
Spatial | Spatial is a metadata term that means the dataset has locational (geographic) information such as coordinates, address, city, or ZIP code. |
Spatial Analysis | The process of modeling, examining, and interpreting model results. Spatial analysis is useful for evaluating suitability and capability, for estimating and predicting, and for interpreting and understanding. |
Structured Data | Refers to data that conforms to a fixed schema. Relational databases and spreadsheets are examples of structured data. |
Structured Query Language (SQL) | A syntax for defining and manipulating data from a relational database. Developed by IBM in the 1970s, it has become an industry standard for query languages in most relational database management systems. |
Self-Service Analytics | A form of BI in which line-of-business professionals are enabled and encouraged to perform queries and generate reports on their own with nominal IT support. Characterized by simple-to-use BI tools, dashboards, and use of aliasing and semantic layers to make data easier to interpret. |
Sensitive Data | Any designated data or metadata that is used in limited ways and/or intended for limited audiences; may include personal, corporate, or government data. Mishandling of published sensitive data may lead to damage to individuals or organizations. |
Standard | A standard is a document that provides requirements, specifications, guidelines or characteristics that can be used consistently to ensure that materials, products, processes and services are fit for their purpose. |
T
Tag | A tag is a keyword or term assigned to a piece of information or a file. This type of metadata helps describe an item and allows it to be found by browsing or searching. |
Theme | An ArcView theme stores map features as primary features (such as arcs, nodes, polygons, and points) and secondary features such as tics, map extent, links, and annotation. A theme usually represents a single geographic layer, such as soils, roads, or land use. |
U
Unstructured Data | Data that is more free form, such as multimedia files, images, sound files, or unstructured text. Unstructured data does not necessarily follow any format or hierarchical sequence, nor does it follow any relational rules. |
V
Visualization | A visual representation of data, such as a chart, graph or dashboard, is often the easiest way of communicating with data, bringing out its key features. Many visualization tools exist such as Google Charts, Excel, ArcGIS, Tableau, and Power BI. |
X
XML (Extensible Markup Language) | A text-based format designed to represent structured information and to store, transport, and share data over the Internet. XML is both human- and machine-readable. |
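As a sketch, Python's standard `xml.etree.ElementTree` module parses an XML document into a tree that can be filtered by tag and attribute. The element names, attributes, and facility names below are illustrative only.

```python
import xml.etree.ElementTree as ET

# A small XML document; element and attribute names are illustrative.
doc = """<facilities>
  <facility type="park"><name>Santana Park</name></facility>
  <facility type="library"><name>Corona Public Library</name></facility>
</facilities>"""

root = ET.fromstring(doc)
# Keep only the <facility> elements whose type attribute is "park".
names = [f.findtext("name") for f in root if f.get("type") == "park"]
print(names)
```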
The following definitions and descriptions have been adapted from the following sources:
National Institute of Standards and Technology
Socrata Data Insights & Knowledge Base
Dallas Office of Data Analytics and BI