[Tutor] Help with two .shp file and python coding

Tariq Khasiri tariqkhasiri at gmail.com
Wed Mar 29 10:19:14 EDT 2023


apologies for not stating my goal of this coding clearly in my last email

The goal is to find the distance from each county to the *nearest
intersection of interstate highways*

On Wed, 29 Mar 2023 at 08:51, Tariq Khasiri <tariqkhasiri at gmail.com> wrote:

> After following the steps
>
> import geopandas as gpd
> from shapely.geometry import Point
>
> # Load shapefiles
> counties = gpd.read_file('county.shp')
> interstates = gpd.read_file('roads.shp')
>
> # To resolve this warning, you can reproject one of the datasets to match
> the CRS of the other. In this case, you could reproject the counties
> dataset to match the CRS of the interstates dataset.
>
> counties = counties.to_crs(interstates.crs)
>
> # Perform spatial join to find nearest interstate for each county
> joined = gpd.sjoin_nearest(interstates, counties, distance_col='distance')
>
> #creating new data frame with necessary columns from the joined dataframe
> of total 27 columns
> new_joined = joined.loc[:, ['LINEARID', 'FULLNAME', 'RTTYP', 'geometry',
> 'index_right', 'statefp', 'countyfp', 'distance']]
>
> print(new_joined.sample(10))
>
> my data frames (10 observations) look like this
>
> Now the rest of the code looks like this. But, I'm not sure how I need to
> edit my code with the columns of new_joined data frame. On top of it I
> need observation of the interstate highways. So from RTTYP column anything
> that starts with *I- *, that's my needed observation. The rest of it is
> not necessary since they are observations of highways or expressways. Only
> need the interstate highway given my goal.
>
> ## the rest of the code needs to be edited
>
> # Calculate distance between each county and its nearest interstate
> def calculate_distance(row):
>     county_point = row.geometry.x, row.geometry.y
>     interstate_point = row.geometry_nearest.x, row.geometry_nearest.y
>     return Point(county_point).distance(Point(interstate_point))
>
> joined['distance'] = joined.apply(calculate_distance, axis=1)
>
> # Save results to file
> joined.to_file('output.shp')
>
> ##new_joined.sample 10 observation is in the following section
>
>  LINEARID           FULLNAME RTTYP  \
> 11187  1104375149380          US Hwy 30     U
> 658    1105056901139              I- 96     I
> 4070    110465934760  Gerald R Ford Fwy     M
> 8925   1102218644426         Edens Expy     M
> 1254   1104492838322          US Hwy 23     U
> 3514   1101476091588              I- 90     I
> 3979    110431486317   Will Rogers Tpke     M
> 12155  1104762432455              I- 40     I
> 10332  1105088962603              I- 80     I
> 12086  1104755971958              I- 94     I
>
>                                                 geometry  index_right statefp  \
> 11187  LINESTRING (-105.52431 41.28870, -105.52369 41...         2773      56
> 658    LINESTRING (-83.08258 42.31966, -83.08260 42.3...          271      26
> 4070   LINESTRING (-85.78228 42.87707, -85.78357 42.8...         1083      26
> 8925   LINESTRING (-87.74634 41.96272, -87.74642 41.9...         1189      17
> 1254   LINESTRING (-83.99342 34.08687, -83.99330 34.0...          427      13
> 3514   LINESTRING (-75.97591 43.09333, -75.97613 43.0...           82      36
> 3979   LINESTRING (-95.27075 36.51022, -95.27031 36.5...         1474      40
> 12155  LINESTRING (-94.31317 35.45632, -94.31351 35.4...         1337      05
> 10332  LINESTRING (-121.54758 38.59867, -121.54713 38...          871      06
> 12086  LINESTRING (-96.28076 46.48266, -96.27988 46.4...          555      27
>
>       countyfp  distance
> 11187      001       0.0
> 658        163       0.0
> 4070       139       0.0
> 8925       031       0.0
> 1254       135       0.0
> 3514       067       0.0
> 3979       035       0.0
> 12155      033       0.0
> 10332      113       0.0
>
>
> On Tue, 28 Mar 2023 at 09:43, ThreeBlindQuarks <threesomequarks at proton.me>
> wrote:
>
>> Tariq,
>>
>> Your entire program is written in a module called geopandas so to even
>> understand it, we would need to find out what exactly various functions in
>> that module you call are documented to do.
>>
>> A little more explanation of what you are doing might be helpful.
>>
>> Near as I can tell, you are reading in two files to make two structures
>> in a pandas format. You join them based on some common columns in whatever
>> manner gpd.sjoin_nearest() does so that each row now contains enough info
>> for the next step.
>>
>> You create a function that accepts one such row and extracts the
>> coordinates of two parts and calculates a distance that I am guessing is a
>> Euclidean distance as the crow flies, not as you drive on roads.
>>
>> You use that function on every row to create a new column.
>>
>> Then you save the results in a file.
>>
>> Does that do something? Who knows. The devil is in the details. The
>> mysterious black box that does the initial join would have to uniquely
>> match up counties and highways in a way that generates exactly the rows
>> needed and exclude any that don't. We (meaning ME) have no assurance of
>> what happens in there. Could you end up comparing all counties with their
>> distance from somewhere on highway 1?
>>
>> - Q
>>
>>
>> Sent with Proton Mail secure email.
>>
>> ------- Original Message -------
>> On Monday, March 27th, 2023 at 10:58 PM, Tariq Khasiri <
>> tariqkhasiri at gmail.com> wrote:
>>
>>
>> > Hello everyone,
>> >
>> > For a particular project, I need to collect the distance data from each
>> US
>> > county to the nearest intersection of interstate highway.
>> >
>> > These are the publicly provided dataset where anyone has access to the
>> > county shp files of usa and roads shp files.
>> >
>> > this one for county polygons
>> >
>> https://public.opendatasoft.com/explore/dataset/us-county-boundaries/information/
>> >
>> > this one for primary roads
>> >
>> >
>> https://catalog.data.gov/dataset/tiger-line-shapefile-2016-nation-u-s-primary-roads-national-shapefile/resource/d983eeef-21d9-4367-9d42-27a131ee72b8
>> >
>> > `import geopandas as gpd from shapely.geometry import Point # Load
>> shapefiles counties = gpd.read_file('path/to/counties.shp') interstates =
>> gpd.read_file('path/to/interstates.shp') # Perform spatial join to find
>> nearest interstate for each county joined = gpd.sjoin_nearest(counties,
>> interstates) # Calculate distance between each county and its nearest
>> interstate def calculate_distance(row): county_point = row.geometry.x,
>> row.geometry.y interstate_point = row.geometry_nearest.x,
>> row.geometry_nearest.y return
>> Point(county_point).distance(Point(interstate_point)) joined['distance'] =
>> joined.apply(calculate_distance, axis=1) # Save results to file
>> joined.to_file('path/to/output.shp')`
>> >
>> > do you think I can execute my goal successfully with the code snippet
>> above
>> > ??
>> > _______________________________________________
>> > Tutor maillist - Tutor at python.org
>> > To unsubscribe or change subscription options:
>> > https://mail.python.org/mailman/listinfo/tutor
>>
>


More information about the Tutor mailing list