[Tutor] Help with two .shp file and python coding

Tariq Khasiri tariqkhasiri at gmail.com
Wed Mar 29 09:51:39 EDT 2023


After following the steps

import geopandas as gpd
from shapely.geometry import Point

# Load shapefiles
counties = gpd.read_file('county.shp')
interstates = gpd.read_file('roads.shp')

# To resolve this warning, you can reproject one of the datasets to match
the CRS of the other. In this case, you could reproject the counties
dataset to match the CRS of the interstates dataset.

counties = counties.to_crs(interstates.crs)

# Perform spatial join to find nearest interstate for each county
joined = gpd.sjoin_nearest(interstates, counties, distance_col='distance')

#creating new data frame with necessary columns from the joined dataframe
of total 27 columns
new_joined = joined.loc[:, ['LINEARID', 'FULLNAME', 'RTTYP', 'geometry',
'index_right', 'statefp', 'countyfp', 'distance']]

print(new_joined.sample(10))

my data frames (10 observations) look like this

Now the rest of the code looks like this. But, I'm not sure how I need to
edit my code with the columns of new_joined data frame. On top of it I
need observation of the interstate highways. So from RTTYP column anything
that starts with *I- *, that's my needed observation. The rest of it is not
necessary since they are observations of highways or expressways. Only need
the interstate highway given my goal.

## the rest of the code needs to be edited

# Calculate distance between each county and its nearest interstate
def calculate_distance(row):
    county_point = row.geometry.x, row.geometry.y
    interstate_point = row.geometry_nearest.x, row.geometry_nearest.y
    return Point(county_point).distance(Point(interstate_point))

joined['distance'] = joined.apply(calculate_distance, axis=1)

# Save results to file
joined.to_file('output.shp')

##new_joined.sample 10 observation is in the following section

 LINEARID           FULLNAME RTTYP  \
11187  1104375149380          US Hwy 30     U
658    1105056901139              I- 96     I
4070    110465934760  Gerald R Ford Fwy     M
8925   1102218644426         Edens Expy     M
1254   1104492838322          US Hwy 23     U
3514   1101476091588              I- 90     I
3979    110431486317   Will Rogers Tpke     M
12155  1104762432455              I- 40     I
10332  1105088962603              I- 80     I
12086  1104755971958              I- 94     I

                                                geometry  index_right statefp  \
11187  LINESTRING (-105.52431 41.28870, -105.52369 41...         2773      56
658    LINESTRING (-83.08258 42.31966, -83.08260 42.3...          271      26
4070   LINESTRING (-85.78228 42.87707, -85.78357 42.8...         1083      26
8925   LINESTRING (-87.74634 41.96272, -87.74642 41.9...         1189      17
1254   LINESTRING (-83.99342 34.08687, -83.99330 34.0...          427      13
3514   LINESTRING (-75.97591 43.09333, -75.97613 43.0...           82      36
3979   LINESTRING (-95.27075 36.51022, -95.27031 36.5...         1474      40
12155  LINESTRING (-94.31317 35.45632, -94.31351 35.4...         1337      05
10332  LINESTRING (-121.54758 38.59867, -121.54713 38...          871      06
12086  LINESTRING (-96.28076 46.48266, -96.27988 46.4...          555      27

      countyfp  distance
11187      001       0.0
658        163       0.0
4070       139       0.0
8925       031       0.0
1254       135       0.0
3514       067       0.0
3979       035       0.0
12155      033       0.0
10332      113       0.0


On Tue, 28 Mar 2023 at 09:43, ThreeBlindQuarks <threesomequarks at proton.me>
wrote:

> Tariq,
>
> Your entire program is written in a module called geopandas so to even
> understand it, we would need to find out what exactly various functions in
> that module you call are documented to do.
>
> A little more explanation of what you are doing might be helpful.
>
> Near as I can tell, you are reading in two files to make two structures in
> a pandas format. You join them based on some common columns in whatever
> manner gpd.sjoin_nearest() does so that each row now contains enough info
> for the next step.
>
> You create a function that accepts one such row and extracts the
> coordinates of two parts and calculates a distance that I am guessing is a
> Euclidean distance as the crow flies, not as you drive on roads.
>
> You use that function on every row to create a new column.
>
> Then you save the results in a file.
>
> Does that do something? Who knows. The devil is in the details. The
> mysterious black box that does the initial join would have to uniquely
> match up counties and highways in a way that generates exactly the rows
> needed and exclude any that don't. We (meaning ME) have no assurance of
> what happens in there. Could you end up comparing all counties with their
> distance from somewhere on highway 1?
>
> - Q
>
>
> Sent with Proton Mail secure email.
>
> ------- Original Message -------
> On Monday, March 27th, 2023 at 10:58 PM, Tariq Khasiri <
> tariqkhasiri at gmail.com> wrote:
>
>
> > Hello everyone,
> >
> > For a particular project, I need to collect the distance data from each
> US
> > county to the nearest intersection of interstate highway.
> >
> > These are the publicly provided dataset where anyone has access to the
> > county shp files of usa and roads shp files.
> >
> > this one for county polygons
> >
> https://public.opendatasoft.com/explore/dataset/us-county-boundaries/information/
> >
> > this one for primary roads
> >
> >
> https://catalog.data.gov/dataset/tiger-line-shapefile-2016-nation-u-s-primary-roads-national-shapefile/resource/d983eeef-21d9-4367-9d42-27a131ee72b8
> >
> > `import geopandas as gpd from shapely.geometry import Point # Load
> shapefiles counties = gpd.read_file('path/to/counties.shp') interstates =
> gpd.read_file('path/to/interstates.shp') # Perform spatial join to find
> nearest interstate for each county joined = gpd.sjoin_nearest(counties,
> interstates) # Calculate distance between each county and its nearest
> interstate def calculate_distance(row): county_point = row.geometry.x,
> row.geometry.y interstate_point = row.geometry_nearest.x,
> row.geometry_nearest.y return
> Point(county_point).distance(Point(interstate_point)) joined['distance'] =
> joined.apply(calculate_distance, axis=1) # Save results to file
> joined.to_file('path/to/output.shp')`
> >
> > do you think I can execute my goal successfully with the code snippet
> above
> > ??
> > _______________________________________________
> > Tutor maillist - Tutor at python.org
> > To unsubscribe or change subscription options:
> > https://mail.python.org/mailman/listinfo/tutor
>


More information about the Tutor mailing list