Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support filtering EPT by OGR datasources #2771

Merged
merged 18 commits into from Oct 7, 2019
Merged

Support filtering EPT by OGR datasources #2771

merged 18 commits into from Oct 7, 2019

Conversation

hobu
Copy link
Member

@hobu hobu commented Oct 1, 2019

Add an ogr option to readers.ept:

        "ogr": {
            "datasource":"/Users/hobu/dev/git/pdal/iowa/interstate-80.gpkg",
            "layer":"interstate-80-buffer",
            "sql":"select fid,geometry from \"freeway-exit\" ",
            "drivers":["GPKG"],
            "openoptions":["ADJUST_GEOM_TYPE=FIRST_SHAPE"],
            "options":{
                "geometry":"POLYGON ((-10147169.072127 5108605.30020218,-10147555.2667239 5107917.68543207,-10146792.2969105 5107267.74818361,-10146095.2627599 5107748.13658465,-10146462.618596 5108529.94515888,-10147169.072127 5108605.30020218))",
                "dialect":"OGRSQL"
            }
        },
  • If both layer and sql are provided, `sql is used.
  • If both bounds and sql are provided, they are intersected
  • drivers is a list of OGR driver shortnames to open the data source with (optional)
  • openoptions is a list of driver open options (optional)
  • options.geometry is a filter geometry (wkt or geojson) that is passed to the ExecuteSQL layer creation
  • dialect tells OGR which SQL dialect to use.

Polygon p = pdal::Polygon(g, srs);
p.transform(getSpatialReference());
m_args->m_polys.emplace_back(p);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to delete the geometries in ogr_geoms.

Copy link
Contributor

@abellgithub abellgithub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to ignore all of Connor's EPT stuff.

// convert whatever we were given for an OGR source to
// polygons and override m_args->m_polys
m_args->m_polys.clear();
std::vector<OGRGeometry*> ogr_geoms = pdal::gdal::fetchOGRGeometries(m_args->m_ogr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason for us to pollute EPT with GDAL awfulness. Seems like this should be hidden in GDALUtils. I think all you do is end up converting to pdal::Polygon anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Then we must pollute pdal::Polygon with GDAL datasource reading.

io/EptReader.cpp Outdated
// Apply our overrides from the fetched layer by taking
// the intersection of the layer's bounds with whatever the
// user provided before
BOX3D filtered = m_args->m_bounds.to3d();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is excessively complicated. One use of cropping seems sufficient, no? I think I would error if you specify both bounds and the OGR filtering. I'm not sure what extra you get out of using both at the same time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed. See latest

for(const auto& s: driverOptions)
papszDriverOptions = CSLAddString(papszDriverOptions, s.c_str());
}
std::vector<const char*> openoptions{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll suggest:

std::vector<std::string> openOpts;
size_t optCount = ogr.count("openoptions");
std::vector<const char *> opts(optCount + 1);
if (optCount)
    openOpts = ogr["openoptions"];
for (size_t i = 0; i < openOpts.size() - 1; ++i)
    opts[i] = openOpts[i].data();

You just have to make sure that openOpts stays in scope until you're done. The names I chose could be better.

The dataset code would look about the same. Guess you could write a lambda.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it the GDAL way so it was clear it was a GDAL thing, and if there were any weird DLL boundary issues with the array it was GDAL's fault and not ours.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any DLL boundary issues because the memory is allocated/destroyed inside PDAL.


std::vector<OGRGeometry*> fetchOGRGeometries(const NL::json ogr)
{
const NL::json& layer = ogr.at("layer");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I hate JSON. It gives the appearance of being simple, but it's just a mess because there's no schema and there's no checking here for anything going wrong. We're pretty good about providing feedback with pipelines and this does none of that checking. As awful the JSON error checking is from a code standpoint, I think that it should be added. Users should know when something is mis-specified and letting it fall down into "bad query" or something similar is really difficult, especially when you have a one character spelling error or something for which we could provide much more specific feedback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no schema and there's no checking here for anything going wrong

This code throws if the expected keys are not found. Keys that are optional are handled. It happens at runtime. There's a bunch of stuff we don't know about the given datasource until we try to open it. Opening OGR datasources can be expensive. We don't want to do it multiple times if we can avoid it.

We have no way in ProgramArgs to handle stuff like this ogr option. It doesn't make sense to flatten it out into 20 different options on a stage, and we also want to add this capability to other stages like filters.crop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not suggesting that we do something other than JSON. I'm suggesting that the JSON input be checked for errors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check for errors at what time? initialize() is when it is happening right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fine time to check the pipeline input.

I think it would be better to do something like build the query in initialize() and then use it in ready(). But I'd have to look at the whole reader a bit to be sure what's feasible and reasonable.

My concern is just that the input validation occur at some point. Right now it's very loose, which is what tends to happen with this JSON stuff. I blame at least a couple of days' time with 3D tiles on this.

while( (poFeature = poLayer->GetNextFeature()) != NULL )
{
OGRGeometry* poGeometry = poFeature->GetGeometryRef();
OGRGeometry* clone = poGeometry->clone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better just to create pdal::Polygon here. It already handles all the stuff with OGRGeometry internally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pdal::Polygon includes GDALUtils for its implementation. We would have to refactor things more heavily to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take that on once this is committed.

@@ -74,6 +74,19 @@ Geometry::Geometry(OGRGeometryH g, const SpatialReference& srs) :
}


Geometry::Geometry(OGRGeometryH gh, OGRSpatialReferenceH srs) :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way to do this, IMO, is to add a ctor to GDALUtils::SpatialRef that takes an OGRSpatialReferenceH and clones it. Then everything else stays the same. No changes to Geometry or Polygon or SpatialReference and it keeps the GDAL code isolated.

std::string srs;
if (isJson)
{
filterGeometry = createFromGeoJson(wkt_or_json, srs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems leaky for filterGeometry. Maybe we should change things so that createFrom... creates pdal::Geometry().

@hobu hobu added this to the 2.1 milestone Oct 2, 2019
@abellgithub
Copy link
Contributor

If you merge master into this it should be close.

@hobu hobu merged commit 531b21c into master Oct 7, 2019
@hobu hobu modified the milestones: 2.1, 2.0.2 Oct 15, 2019
@hobu hobu modified the milestones: 2.0.2, 2.1 Dec 20, 2019
@hobu hobu added the feature label Jan 31, 2020
@hobu hobu deleted the ogr-filter-options branch February 3, 2020 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants