The CVS data format stores cartographic data for a specific geographic area into a single file. Cesar examines the format, then presents a tool for converting CVS files into DXF format.
May 01, 1999
URL:http://www.drdobbs.com/database/the-cvs-data-format/184410936
Cesar is a researcher for the Landscape Archaeology Research Unit at the University of Santiago de Compostela in Spain. He can be contacted at [email protected].
As a computer specialist working with archaeologists, I've found many areas of activity that suffer from a lack of appropriate tools and methods. One of the most notorious involves the use of cartographic information to locate archaeological sites and other geographical places and set them in context. Paper maps are often the only means of dealing with geographical locations, apart from lists of coordinates, which seldom solve any problem. Of course, commercial Geographical Information System (GIS) packages exist, but none combine power with ease of use. They tend to have too many features and are less than intuitive for those lacking computer training -- not to mention they are usually expensive or a pain in the neck to use.
Consequently, I and other members of the Landscape Archaeology Research Unit at the University of Santiago de Compostela decided to invest in research on simple cartographic representations for geographic location and reference. As a result, we designed and implemented a new data format and a small set of accompanying tools.
The biggest problem when dealing with cartographic information is the huge amount of data needed to acceptably manage and display a medium-sized area. Our archaeological work is strongly based on the zoom principle, which says that any study must be done at several scales centered on the same area to be precise and in context. Also, our area of work covers the whole of Galicia, roughly 30,000 square kilometers. In addition, we specialize in archaeological impact assessment, which often involves linear projects such as motorways or pipelines, with very long and narrow work areas instead of the classical circular ones.
We envisioned a system capable of displaying a layered contour map, relieving users from intrusive tasks such as changing sheets after hitting a sheet border or changing scales. Also, a major problem in some GIS tools is the huge number of files they create. Since we believe that the users shouldn't have to worry about thousands of files and relationships among them, we decided that the system should integrate all the information about a specific wide area in a single file, including different levels of detail. We did not attempt to perform automatic geographic generalization, but instead to store already-computed data about different levels of detail into one file. The system would then select the most appropriate data set from context information, such as the working scale, output destination, and user preferences.
Furthermore, the huge amount of information required to deal with cartography results in the need to index the data inside the files so retrieval is fast enough. A 2D indexing scheme was needed, because cartographic data is almost always retrieved following an inside-rectangle test. Our own experiments showed that raw lists of coordinates with no indexing did well in small areas (up to 50 km²), but performed badly above this limit.
The result of our work is the CVS data format (the acronym translates into English as "Segmented Vectorial Cartography"), which stores homogeneous cartographic data for a specific geographic area in a single file, optionally including different levels of detail and offering a two-dimensional indexing scheme.
A CVS file holds what in classical terms could be called a layer, or information relative to a single thematic coverage. The layer concept has been extensively used in the GIS world and is beyond the scope of this discussion. To obtain a complete map, several layers are usually necessary, so several CVS files are needed.
The information inside a CVS file is partitioned into levels, each corresponding to a level of detail at which the geographic information of the area can be represented. In fact, all the levels in a CVS file represent the same area, but at different levels of detail. Thus, each level is best suited for display within a specific range of working scales. Levels are the means by which the zoom principle can be successfully applied.
Also, the information inside a CVS file is partitioned into sectors, each corresponding to a rectangle within the area to be represented. Division into sectors is made separately for each level, so low-detail levels can be partitioned into few sectors (even just one), and high-detail levels can be divided into up to 65,536 sectors in the current implementation. Sectors are the way to achieve two-dimensional indexing.
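The partitioning into rectangular sectors can be illustrated with a minimal Python sketch. It assumes a regular grid of equal rectangles, which the article does not specify (sectors are only said to be rectangles); the names `Rect`, `make_grid`, and `sector_index` are hypothetical and are not part of the CVS access library.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; field names are illustrative."""
    from_x: float
    from_y: float
    to_x: float
    to_y: float


def make_grid(area: Rect, cols: int, rows: int) -> List[Rect]:
    """Split `area` into cols x rows equal sectors, listed row by row.

    Assumption: a regular grid. The real format may partition differently.
    """
    width = (area.to_x - area.from_x) / cols
    height = (area.to_y - area.from_y) / rows
    sectors = []
    for r in range(rows):
        for c in range(cols):
            sectors.append(Rect(
                area.from_x + c * width,
                area.from_y + r * height,
                area.from_x + (c + 1) * width,
                area.from_y + (r + 1) * height,
            ))
    return sectors


def sector_index(area: Rect, cols: int, rows: int, x: float, y: float) -> int:
    """Index into make_grid's list of the sector containing point (x, y)."""
    width = (area.to_x - area.from_x) / cols
    height = (area.to_y - area.from_y) / rows
    # Clamp so points on the far edges fall into the last column or row.
    c = min(int((x - area.from_x) / width), cols - 1)
    r = min(int((y - area.from_y) / height), rows - 1)
    return r * cols + c
```

Under this assumption, a low-detail level could use `make_grid(area, 1, 1)`, while a 256 by 256 grid would reach the 65,536-sector limit mentioned above.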
Every sector in a CVS file contains curves, as the CVS format is initially oriented to deal with contour maps. Each curve is stored as a sequence of points. A curve spanning two or more sectors is split into as many curve segments as needed, all of them with the same curve identifier. We plan to improve the CVS specification with capabilities to store kinds of information other than curves.
As Figure 1 illustrates, each CVS file contains a file header, data header, one or more levels, and one or more sectors for each level. The file header contains a magic number identifying the file as a CVS file, information about format revision (currently Version 2), and room for future extensions such as the content type. Currently, no content type is specified as only one is implemented. The data header holds the minimum and maximum values for x-, y-, and z-coordinates in the whole file, the number of levels in the file, and some data for each of them. In turn, each level contains a level header and data for each sector inside it. The level header carries the sector count for this level, and some information for each of them. Each sector holds a sector header and the cartographic information itself in the form of curves. Curves are not indexed, and consist of a curve identifier and a sequence of coordinate triplets.
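The nesting just described can be modeled in memory as follows. This is only a sketch in Python: the class and field names are illustrative, the on-disk field sizes and ordering are not given in the text and are not represented here, and the file pointers are replaced by plain nested lists.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vertex = Tuple[float, float, float]  # (x, y, z) coordinate triplet


@dataclass
class Curve:
    curve_id: int                      # shared by all segments of one curve
    vertices: List[Vertex] = field(default_factory=list)


@dataclass
class Sector:
    from_x: float                      # sector rectangle
    from_y: float
    to_x: float
    to_y: float
    curves: List[Curve] = field(default_factory=list)


@dataclass
class Level:
    scale_from: float                  # meters per pixel
    scale_to: float
    sectors: List[Sector] = field(default_factory=list)


@dataclass
class CvsFile:
    revision: int                      # format revision from the file header
    from_xyz: Vertex                   # minimum x, y, z in the whole file
    to_xyz: Vertex                     # maximum x, y, z in the whole file
    levels: List[Level] = field(default_factory=list)


def total_vertices(cvs: CvsFile) -> int:
    """Count every stored vertex by walking levels, sectors, and curves."""
    return sum(len(curve.vertices)
               for level in cvs.levels
               for sector in level.sectors
               for curve in sector.curves)
```

A file with one level, one sector, and one three-vertex curve would be built by nesting these objects, and `total_vertices` would then return 3.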
The file pointers from the LevelInfo and SectorInfo data elements in Figure 1 point to LevelData and SectorData, respectively, and constitute the foundation of the indexing mechanism. The ScaleFrom and ScaleTo fields for each LevelInfo are stored in meters per pixel (MPP), a good way to express scale on digital media. The higher these values, the lower the level of detail. On a 17-inch monitor with a resolution of 1024×768 pixels, 100 MPP corresponds to a conventional scale of approximately 1:320,000. Also, 64-bit floating-point numbers are used to store coordinates, allowing the CVS data format to deal with Universal Transverse Mercator (UTM) coordinates, our system of choice as the whole of Galicia is contained in a single UTM zone.
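The relationship between MPP and a conventional map scale can be worked through with a short Python sketch. The function name and the idea of deriving the pixel size from the screen diagonal are assumptions made for illustration; the article does not say how its 1:320,000 figure was obtained.

```python
import math

INCH_IN_METERS = 0.0254


def scale_denominator(mpp: float, diagonal_inches: float,
                      width_px: int, height_px: int) -> float:
    """Return N in the conventional scale 1:N for a given MPP value.

    One pixel covers `mpp` meters on the ground and
    (diagonal in meters / diagonal in pixels) meters on the screen;
    the ratio of the two is the scale denominator.
    """
    diagonal_px = math.hypot(width_px, height_px)
    pixel_size_m = diagonal_inches * INCH_IN_METERS / diagonal_px
    return mpp / pixel_size_m
```

With a nominal 17-inch diagonal the result for 100 MPP is a little under 1:300,000; the 1:320,000 quoted above is matched when the visible image diagonal is taken to be about 15.75 inches, which is plausible for a CRT of that class but remains an assumption.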
Finally, the CVS data format performs a little trick to improve data retrieval performance. Each sector stores all the vertices of the curves it includes, plus two more optional vertices for each curve, one before the first vertex inside the sector (in case the curve starts outside the sector), and the other after the last vertex (in case the curve ends outside the sector). This offers the whole path of a curve segment for each sector. See Figure 2 for details on these off-by-one vertices.
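A possible way to build the per-sector curve segments, including the extra vertices, is sketched below in Python. The function name and the tuple layout are hypothetical. The sketch also makes two simplifying assumptions that go beyond the text: a curve that enters the same sector several times yields one segment per visit, and a curve that crosses a sector without placing any vertex inside it is ignored.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]      # (x, y, z)
Rect = Tuple[float, float, float, float]  # (from_x, from_y, to_x, to_y)


def sector_segments(curve: List[Vertex],
                    rect: Rect) -> List[List[Tuple[Vertex, bool]]]:
    """Split `curve` into the segments stored for the sector `rect`.

    Each segment is a list of (vertex, inside) pairs: every vertex inside
    the sector, plus the vertex just before the run and the vertex just
    after it when they exist, both flagged as outside.
    """
    from_x, from_y, to_x, to_y = rect

    def inside(v: Vertex) -> bool:
        return from_x <= v[0] <= to_x and from_y <= v[1] <= to_y

    segments = []
    i, n = 0, len(curve)
    while i < n:
        if not inside(curve[i]):
            i += 1
            continue
        start = i
        while i < n and inside(curve[i]):
            i += 1
        end = i  # one past the last inside vertex of this run
        segment = []
        if start > 0:
            segment.append((curve[start - 1], False))  # extra leading vertex
        segment.extend((v, True) for v in curve[start:end])
        if end < n:
            segment.append((curve[end], False))        # extra trailing vertex
        segments.append(segment)
    return segments
```

The boolean flag plays a role similar to the `bInside` argument of `GetVertex` in the listing at the end of this article, which lets a reader tell the stored extra vertices apart from those that really lie inside the sector.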
Assume that a CVS file is stored on disk, and that some piece of software wants to read it to display a map. After checking the magic number to reduce the risk of file type conflicts, and verifying that the CVS revision of the file is compatible with the revision it supports, the software checks that the map area it intends to display intersects the area specified by the data header fields FromX, FromY, FromZ, ToX, ToY, and ToZ. If it does not, the CVS file contains no useful data. If this test is successful, the software scans through every LevelInfo element to find the one with an appropriate range of scales, looking at the ScaleFrom and ScaleTo fields. Once found, the software can follow the LevelInfo's Pointer into a LevelData, which contains a header with the sector count and some information for each sector. Scanning through the SectorInfo elements, the software builds a list of the sectors to be retrieved to draw the map, by computing whether or not each sector area, given by the FromX, FromY, ToX, and ToY fields, intersects the wanted map area. Once this list is built, the software iterates over it, navigating to the cartographic information by following each SectorInfo's Pointer field into the corresponding SectorData. From this element, the software reads the curve count and starts iterating over every curve. Curves are not indexed or delimited, so retrieving all the curves in a sector is, in the current form of CVS, a strictly sequential process. Each curve starts with a CurveHeader element that holds a curve identifier and a vertex count, followed by a sequence of vertices, each consisting of x-, y-, and z-coordinates.
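The level and sector selection steps of this procedure can be sketched in Python as follows. The types and function names are hypothetical, the file pointers are replaced by in-memory lists, and two details are assumptions not stated in the text: that ScaleFrom is less than or equal to ScaleTo with both ends inclusive, and that the first matching level is used.

```python
from typing import List, NamedTuple, Optional


class Rect(NamedTuple):
    from_x: float
    from_y: float
    to_x: float
    to_y: float


class SectorInfo(NamedTuple):
    rect: Rect
    pointer: int          # stands in for the file offset of SectorData


class LevelInfo(NamedTuple):
    scale_from: float     # meters per pixel
    scale_to: float
    sectors: List[SectorInfo]


def intersects(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles overlap or touch."""
    return (a.from_x <= b.to_x and b.from_x <= a.to_x and
            a.from_y <= b.to_y and b.from_y <= a.to_y)


def pick_level(levels: List[LevelInfo], mpp: float) -> Optional[LevelInfo]:
    """Return the first level whose scale range contains the working scale."""
    for level in levels:
        if level.scale_from <= mpp <= level.scale_to:
            return level
    return None


def sectors_to_read(level: LevelInfo, view: Rect) -> List[SectorInfo]:
    """List the sectors whose rectangle intersects the wanted map area."""
    return [s for s in level.sectors if intersects(s.rect, view)]
```

A reader would call `pick_level` with the current working scale, pass the result and the visible rectangle to `sectors_to_read`, and then read the curves of each returned sector sequentially.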
We've developed a number of tools to work with CVS files.
The CVS data format is currently being used to provide cartographic facilities to our main information system, used by 25 simultaneous users several hours a day. The CVS files in use cover the whole of Galicia, and integrate the full 1:100,000 cartography of this area. CVS files live on our applications server, and each client reads them through a local copy of the CVS access library. The first improvement we have made to the described set of tools is to port the CVS access library from Visual Basic 5 to Visual C++ 5, achieving some performance improvements. (We have not performed measured tests, but our experience indicates that the slight improvements are mainly due to the disk-access mechanisms used by the C++ libraries compared with those of Visual Basic. Thanks to my colleague Roberto Gomez, who ported the CVS access library to Visual C++.) Currently, we are planning to redesign it as a server-side component so that only the selected sectors travel through the network, and the sequential portion of the work (iterating over all curves in each sector) benefits from being executed on the server. We have experimented with DCOM and found it suitable for a design like this.
We are also planning to convert the CVSTest tool into an ActiveX control, including a canvas and enough functionality to draw and manage multilayer maps, so any application written in any ActiveX-hosting language could use it. Also, the DAT2DXF converter must be improved in both performance and disk-space requirements. Finally, extending the CVS data format to host content kinds other than curves is easy, and we intend to do so in the future. We consider digital elevation models and archaeological site distributions as candidate content kinds.
We routinely overlay other data that we use (such as archaeological sites) on top of CVS layers, pulling it from Microsoft SQL Server 6.5, Microsoft Access 96, and CA Jasmine databases, depending on the system. Our main internal working system, the SIA+ Archaeological Information System (an integrated information system for the management of archaeological sites and finds, assessments, projects, people, documents, and images; see http://wwwgtarpa.usc.es/), pulls data from a 45-MB Access database to show geographic locations, zones, and sites atop the CVS layers.
The CVS data format is an inexpensive and easy-to-use solution for applications that need to display and operate on contour maps. We know that many improvements are still necessary to make the CVS data format a professional solution. Any help or collaboration will be welcome.
DDJ
    'open a CVS file.
    Dim ly As New Layer
    ly.OpenFile "C:\Temp\Test.cvs"

    'get the first level.
    Dim lv As Level
    Set lv = ly.Levels(1)

    'iterate all sectors.
    Dim lSectorIdx As Long
    Dim sc As Sector
    For lSectorIdx = 1 To lv.Sectors.Count

        'get sector.
        Set sc = lv.Sectors(lSectorIdx)

        'output data.
        Debug.Print "Sector " & CStr(lSectorIdx) & ":"

        'begin retrieving curve data for this sector.
        Dim lCurveCount As Long
        sc.BeginGetData lCurveCount

        'iterate all curves in this sector.
        Dim lCurveIdx As Long, lId As Long
        For lCurveIdx = 1 To lCurveCount

            'get curve info.
            Dim lVertexCount As Long
            sc.GetCurveInfo lId, lVertexCount

            'output data.
            Debug.Print " Curve " & CStr(lId) & " with " & CStr(lVertexCount) & " vertices:"

            'iterate vertices for this curve.
            Dim lVertexIdx As Long
            For lVertexIdx = 1 To lVertexCount

                'get vertex data.
                Dim dX As Double, dY As Double, dZ As Double
                Dim bInside As Boolean
                sc.GetVertex dX, dY, dZ, bInside

                'output data.
                Debug.Print " Vertex (" & CStr(dX) & ", " & CStr(dY) & ", " & CStr(dZ) & ") " & CStr(bInside)

            Next lVertexIdx

        Next lCurveIdx

        'end retrieving curve data.
        sc.EndGetData

    Next lSectorIdx

    'close CVS file.
    ly.CloseFile