GGL2: 2011

Thursday, November 17, 2011

Raster processing

It's been a long time since we last wrote a post. I've been involved in other projects and I've had no time to work on GGL2. Now we're preparing the workshop at the gvSIG conference and we have made a great breakthrough by adding raster capabilities to the language.

Now it is possible to read raster data from ASCII grid files, TIFFs and JPEGs and access the individual samples:

read ASC '/tmp/dem.asc' to mydem;
show dem[10, 10];

The value at a specific world point can be accessed easily as well. Say "mypoint" is the geometry containing the point we're interested in, the following code woulf give the raster value for that point:

show dem at(ST_X(mypoint), ST_Y(mypoint))

But the hardest part have been the "rexp" expressions. Rexp stands for Raster EXPressions and as it may be taken for "regular expression", we are tempted to change it to "rop" for Raster OPeration. Anyway, it's called Rexp during this post.

Rexps lets the user filter the samples of a raster. For example, say we just want to have the samples in our dem that are higher than 100 meters:

result = rexp dem do (h) {
 if (h > 100) {
   return h;
 } else {
   return null;
 }
};

In the previous construction

"rexp dem" means "I want to process raster stored in 'dem' variable".
"do (h)" says "let 'h' be any sample of the input raster".
The code between curly brackets return the value of the sample for the output raster.

When the sample expression is simple enough it is possible to use the more concise "eval" syntax. For example, the following code produces the shadow of the dem in the north direction:

result = rexp dem eval (h| h[-1, -1] + 2*h[0, -1] 
  + h[1, -1] 
  + h - h[-1, 1] - 2*h[0, 1] - h[1, 1])

It is possible to involve more rasters, georreference them, etc.

If you think rexp is cool, please note that it's not my idea but I copied it from Jiffle, a raster specific language from the super jaitools project[1].

There is a lot of work ahead: supporting more raster formats, optimizing the execution, etc. But ah! when I think how much did it took to do all these operations in GGL1 I get so happy...

More on this at the gvSIG International Conference.

[1] http://jaitools.org/

Monday, August 22, 2011

XML processing

The new 2.0M2 release will be out soon. One of the main reasons to start the development of the second version of the GGL language was the difficulties to access XML data (and any hierarchical data in general) with GGL1 "flat table" model.

This task will be much more easy with GGL2 so let's take a look.

In previous posts we've seen how to access element(feature) attributes using the '/' operator like this:

show myshape/the_geom

The big difference with GGL1 is that elements attributes may be also complex elements so that the slash operator can be used several times in the same expression, provided that the data structure being accessed has those levels of hierarchy:

show car/engine/brand

As, typically, this hierarchical data will be stored in XML files, GGL2 provides a generic XML reader which means that... yes: GGL2 can read any* XML file. For example, in the following code we show the "lat" and "lon" attributes of every point in every route in a GPX file:

import ggl.xml;

read XML '/tmp/fells_loop.gpx' to gpx;

points = gpx/rte/rtept;

foreach p in points {
show p/@lat + ', ' + p/@lon;
}

Another example of the use of XML in GGL2 is the storage of a Shapefile as XML:

import ggl.xml;
import ggl.shp;

read SHP '/tmp/lineas.shp' to lineas;

elem = {linea=lineas};

write XML elem to '/tmp/out.xml';

The previous code produces the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://www.gearscape.org/xmlio" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:gml="http://www.opengis.net/gml/3.2" xsi:schemaLocation="http://www.gearscape.org/xmlio out.xsd">
<linea>
<the_geom>
<gml:MultiCurve gml:id="_0">
<gml:curveMember>
<gml:LineString gml:id="_1">
<gml:pos>29.749046840958705 68.23897058823553</gml:pos>
<gml:pos>32.352498638344336 70.79517505787064</gml:pos>
<gml:pos>33.847724332788786 68.72824542143272</gml:pos>
</gml:LineString>
</gml:curveMember>
</gml:MultiCurve>
</the_geom>
<ID>-5.0</ID>
</linea>
<linea>
<the_geom>
<gml:MultiCurve gml:id="_2">
<gml:curveMember>
<gml:LineString gml:id="_3">
<gml:pos>32.352498638344336 70.79517505787064</gml:pos>
<gml:pos>34.61292807053388 68.45558661832814</gml:pos>
<gml:pos>36.78540305010905 67.91421568627474</gml:pos>
</gml:LineString>
</gml:curveMember>
</gml:MultiCurve>
</the_geom>
<ID>-1.0</ID>
</linea>
<linea>
<the_geom>
<gml:MultiCurve gml:id="_4">
<gml:curveMember>
<gml:LineString gml:id="_5">
<gml:pos>30.25038722086067 70.4233419713374</gml:pos>
<gml:pos>30.742932155501194 71.04781858489949</gml:pos>
<gml:pos>31.10354541122015 70.34418296398445</gml:pos>
<gml:pos>30.63738681236394 69.7988653577753</gml:pos>
<gml:pos>31.886340039488125 69.27993408735045</gml:pos>
<gml:pos>32.16779428785414 69.99236515352693</gml:pos>
<gml:pos>32.352498638344336 70.79517505787064</gml:pos>
</gml:LineString>
</gml:curveMember>
</gml:MultiCurve>
</the_geom>
<ID>3.0</ID>
</linea>
</root>

which, you'll guess, is standard GML.

With this functionality we've accomplished the last of the aims that motivated the remake of the language. Some usability and performance issues remain to be solved but we're almost at the end of the road. Soon we'll have a new release and then: courses, workshops, etc.

* Any XML file which schema is understood by the GGL2 XML reader. It's a work in progress and there is no support for some rare (and not so rare) XSD constructions.

Thursday, June 23, 2011

Processing geometries

As said in the post about data types, geometries are simple data types at the same level strings are. Typically this means that they are treated as a whole and there is no way to mess around with the internal components of those values.

However, in case of GGL2 there is a special construction to process the geometry coordinates individually: the "process coordinates" block.

g = LINESTRING(0 0, 10 10);
process g coordinates as coordinate {
show coordinate;
}

These construction lets us process all the coordinates of a geometry, but there's more. If a "replace" statement is used inside the code block it will replace the values of that coordinate with the values specified in that same statement. For example, this makes possible to modify internally a geometry to translate all the coordinates of a geometry with an offset:

process geom coordinates as c {
x = ST_X(c) + deltaX;
y = ST_Y(c) + deltaY;
replace x y z;
}

And the previous code conveniently packaged:

alg ST_Translate(geometry geom, double deltaX, double deltaY) returns geometry {
process geom coordinates as c {
x = ST_X(c) + deltaX;
y = ST_Y(c) + deltaY;
z = ST_Z(c) + deltaZ;
replace x y z;
}

return geom;
}

This method can be used with a single geometry:

show ST_Translate(POINT(0 0), 10, 10);

but also in a "select" statement to process all the elements in a table:

show myShape select (f| ST_Translate(f/the_geom, 10, 10));

Another interesting algorithm may be the ST_Transform one, that performs coordinate system transformations on the geometries. As we'll see in the next post, it is possible to call Java code from GGL2 so it would be relatively easy to implement it using geotools... or proj4.

Tuesday, June 14, 2011

Packaging and reusing algorithms

Now that I've shown you some little tricks with GGL2 it's time to see how we can package some of these tricks for a later use.

Consider the previous post example, the grid creation:

grid = [as geometry];
width = 100;
height = 100;
foreach x1 in 0 .. 1000 step width {
foreach y1 in 0 .. 1000 step height {
double x2 = x1 + width;
double y2 = y1 + height;
geometry polygon = POLYGON((x1 y1, x2 y1, x2 y2, x1 y2, x1 y1));
add polygon to grid;
}
}

It creates a grid that covers the (0, 0)-(1000, 1000) extent with cells 100x100 big. Having such a piece of code may be useful, but having to change manually the values of the extent or the size of the cell is not handy. The solution are "algorithms":

library ggl.grid;

import ggl.geom;

alg createGrid(double cellWidth, double cellHeight, geometry extent) returns sequenceof geometry {
grid = [ as geometry ];
foreach x in ST_MinX(extent) .. ST_MaxX(extent) step cellWidth {
foreach y in ST_MinY(extent) .. ST_MaxY(extent) step cellHeight {
double x_ = x + cellWidth;
double y_ = y + cellHeight;
geometry polygon = POLYGON((x y, x_ y, x_ y_, x y_, x y));
add polygon to grid;
}
}
return grid;
}

Algorithms are contained in libraries that are declared as seen in the first statement. As we'll use some algorithms defined in the "ggl.geom" library we need to import it with the second statement. The rest of the code is similar to the previous post, but wrapped in the algorithm declaration, which defines the "variable" parts of the code as parameters so that it is possible to call the same algorithm with different values. In this concrete example, the parameters are "cellWidth", "cellHeight" and "extent".

It is possible to "call" the algorithm by naming it and passing the concrete values between parenthesis. For example, we could create a grid to cover the (0, 0)-(10, 1) extent with cells 1x1 big:

import ggl.grid;

myGrid = createGrid(1, 1, POLYGON((0 0, 10 0, 10 1, 0 1, 0 0)));

A last note, GGL2 editor can be connected to GIS instances, such as GearScape (and soon gvSIG), making it possible to reference the layers in the GIS instance by their name. In such scenario, it's very simple to create a grid that covers the whole extent of a layer:

import ggl.grid;

grid = createGrid(100, 100, ST_Extent(myLayer/the_geom));

In a next post we'll see some polymorphic features that are very useful when packaging algorithms. We'll also take a look at the possibility to call Java code from GGL2.

Monday, June 13, 2011

Create grid

In a previous post I showed how it was possible to create a polygon using variables inside the WKT expressions. Now we'll see how it is possible to create a grid in few lines of GGL2 code.

Consider that we want to create a grid which cell size is 100x100 and that covers the (0, 0)-(1000, 1000) extent. By using two nested loops that run from 0 to 1000 in increments of 100 it is possible to obtain the upper left corner of every cell:

foreach x1 in 0 .. 1000 step 100 {
foreach y1 in 0 .. 1000 step 100 {
show x1 + ', ' + y1;
}
}

Now in the body of the inner loop we can create a cell using pseudo WKT and add it to a resulting collection:

grid = [as geometry];
width = 100;
height = 100;
foreach x1 in 0 .. 1000 step width {
foreach y1 in 0 .. 1000 step height {
double x2 = x1 + width;
double y2 = y1 + height;
geometry polygon = POLYGON((x1 y1, x2 y1, x2 y2, x1 y2, x1 y1));
add polygon to grid;
}
}

Simple, right? The only thing that may worth explaining is the creation of an empty sequence in the first line... but that will be another post. I'll also show how to package this as an algorithm and to use the extent of an existing data source as the area to cover with the grid.

Saturday, May 28, 2011

Type system (II)

In the previous post we saw the different basic data types available in GGL2. In this one we'll take a look at the composite ones.

There are two posibilities to compose a type: elements and sequences.

Elements are tuples of pairs name=value and typically can be understood as "rows" in a table. For example, in GGL2, a feature read from a shapefile that has an integer attribute called id would be an element with a geometric child called the_geom and a numeric one called id. The element values can be accessed with the / (slash) operator followed by the name of the child:

Feature f = ...;
show f/the_geom;
show f/id;

On the other side, sequences are just that, a sequence of values of a concrete type (either basic or composite). So, if elements can represent shapefile features, sequences of elements can represent the whole shapefile.

There are a lot of constructions to process sequences (that may be shapefiles, postgis tables, etc.). For example, it is possible to filter them to show the elements in the shapefile that have an id lower than 10, or to sort them:

show myShape filter( f| f/id < 10);
show myShape sort( f| f/id);

Finally, note that these constructions are not limited to sequences of elements, they can be applied to any sequence, for example, sequences of ints:

sequenceof int myInts = [1, 2, 3, 5, 7, 9, 11, 13];
show myInts filter( i| i < 10);

The best part of the GGL2 type system is that elements can contain sequences of other elements and, therefore, hierarchical data structures (GPX, GML based formats) will be accessible from GGL2 without problems. I've done no test yet so... that will be another post.

Friday, May 27, 2011

Type system (I)

As GGL2 is a language to treat data, it is specially important to have a type system able to represent the data structutes that are going to be processed. As many other languages, GGL2 lets the user define variables that are strongly typed, this is, that can only contain values of their type (there are exceptions, though).

GGL2 variables can be of simple built-in types or a composite of one or more GGL2 types. In this post, only simple types will be explained. So, for example, it is possible to have numeric and textual variables and initialize them to integer and textual values:

int a = 2;
string text = 'hello';

but, as it is a language to process spatial data, "geometry" is also a basic type and it is possible to have geometric variables and initialize them to geometric literals. What are geometric literals? Well, kind of Well Known Text:

geometry g = POLYGON((0 0, 10 0, 10 10, 0 10, 0 0));
geometry p = POINT(0 0);

The good thing about these WKT literals is that, indeed, they are not literals. I don't want to go deeper on language terminology but, as long as you can use variables in pseudo WKT expressions, they are no longer literals:

int ten = 10;
geometry g = POLYGON((0 0, ten 0, ten ten, 0 ten, 0 0));

In a next post I'll take advantage of this feature to do some cool things with very few code. Stay tunned!

Now, enough about technical details, lets go a little thoughtful: What value do variables give to the language?

Well, consider the buffer in a previous post:

myshape select (f| the_geom=ST_Buffer(f/@the_geom, 10), id=f/@id);

With variables (and loops) we can create several buffers easily:

foreach size in 0 .. 100 step 10 {
myshape select (f| the_geom=ST_Buffer(f/@the_geom, size), id=f/@id);
}

This can be performed in SQL aswell but, IMHO, GGL2 code is easier to produce, read, and therefore, maintain.

A last point on variables, although GGL2 is strongly typed it is possible to declare a variable with, apparently, no type:

a = 2;
text = 'helo';
p = POINT(0 0);

BUT, this doesn't mean that the variable has no type. Indeed, GGL2 takes the type from the right part of the assignation so it is clear that a is an int, text is a string and p is a geometry. This may seem a worthless functionality, but as types get more complicated it is VERY useful.

Thursday, May 26, 2011

Hello buffer world

As hello world is very simple with GGL2:

show 'hello world';

I'm going to show something more complicated: the omnipresent buffer.

import ggl.geom;
import ggl.shp;

read SHP '/tmp/mishape.shp' to myshape;

result = myshape select (f| the_geom=ST_Buffer(f/@the_geom, 10), id=f/@id);

write SHP result to '/tmp/test.shp';

The first two lines import the necessary libraries containing shapefile reader and writer and algorithms to process the geometries. After that, the read statement stores in myshape the contents of the shapefile. Then a select predicate is used to select the buffer of the @the_geom attribute (field) and the @id attribute as it is.

The select predicate iterates over the elements of the shapefile sequence and transforms them as specified between parenthesis. There, the current element is named f and the attributes of the resulting element are the left side of the assignments:

@the_geom=ST_Buffer(f/@the_geom, 10), @id=f/@id

Finally the result is assigned to a result variable that is written in the last statement to /tmp/test.shp.

Basically it's the same that can be accomplished with a SQL select statement, only that a different syntax is used.

Wednesday, May 25, 2011

Welcome to GGL2 blog

In this blog I'll present some examples of what can be done with the next release of GGL2 language. As soon as new constructions are developped I'll put here a post with some examples of what can be done by using it.

Thus, I expect to give some informal insight on what the first release of the language will do. I hope this will make people increasingly excited and cry of joy at the moment of the release.

I'll also raise a priori questions about the best construction to solve a particular kind of problem so that comments in that post can propose ideas that can be implemented later. Your collaboration is very much welcome.

If you don't know what GGL2 is you can take a look here: http://www.gearscape.org/