Answer a question

Two-part question. First, when I run the code without the final if statement, I don't get all of the href values: the Inspector shows many more links than the ones that come through.

I'm looking for a fix, but I'd also like to understand the general principle: is there a reason why some links would come through and others would not?

Second, I want to pull only the href values that contain "Surf-Report". I've used this code with p.startswith, and it works, but I couldn't find the function call that means "contains".

I'm new to all of this; I've been searching but don't fully understand either issue.

import requests
from bs4 import BeautifulSoup

profiles = []
urls = [
    'https://magicseaweed.com/New-Jersey-Monmouth-County-Surfing/277/',
    'https://magicseaweed.com/New-Jersey-Ocean-County-Surfing/278/'
]
for url in urls:
    req = requests.get(url)
    soup = BeautifulSoup(req.text, 'html.parser')
    for profile in soup.find_all('a'):

        profile = profile.get('href')

        profiles.append(profile)

# print(profiles)

for p in profiles:
    if p.contains('Surf-Report'):
        print(p)

For context, my overall goal is to go to these county pages and collect all of the href values there. Once I have those, I want to visit each individual link and pull the wave sizes from each page.

I'm looking to build a way to monitor all waves in New Jersey daily... no purpose, just a fun practice project with something I find interesting.

Answers

The URLs on that page appear to be loaded dynamically, via one or more XHR calls, which is why they are missing from the HTML that requests receives. A brief look at the Network tab in the page's dev tools showed a call to an API (from which I stripped the query variables). Scraping that API returns over 8k results:

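import requests
import pandas as pd

# Endpoint seen in the browser's Network tab, with its query variables stripped
r = requests.get('https://magicseaweed.com/api/mdkey/spot?&limit=-1')
# Load the JSON response into a DataFrame, one row per spot
df = pd.DataFrame(r.json())
print(df)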

Result:

|    | _id | _obj | _path | name | description | lat | lon | dataLat | dataLon | surfAreaId | dataSpotId | url | multiplier | optimumSwellAngle | optimumWindAngle | timezone | offset | modelName | isBigWave | ratingType | timeZoneAbbr | hasAdvancedForecast | proteusDataId | proteusResolution | surflineSpotId | defaultModelId | topLevelNav | tidalPort | isDataSpot | favouriteCount | mapImageUrl | breakingWaveModelId | weatherModel | added | hidden | edited | pointOfInterestId | useSDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Spot | Spot | Newquay - Fistral North |  | 50.4184 | -5.0997 | 50.42 | -5.08 | 7 | nan | /Newquay-Fistral-North-Surf-Report/1/ | 0.7 | 290 | 110 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | UK_4m | 584204214e65fad6a7709cec | 42 | True |  | True | 0 | https://chart-1.msw.ms/maps/spot/2576f3cfb35dba07a84590141d54d3a5.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | c10396fc-ed41-4771-8e8e-ab8dbff5c67c | True |
| 1 | 2 | Spot | Spot | Porthtowan |  | 50.2891 | -5.2461 | 50.27 | -5.3 | 6 | nan | /Porthtowan-Surf-Report/2/ | 0.8 | 290 | 110 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 5842041f4e65fad6a7708c98 | 38 | True |  | True | 0 | https://chart-3.msw.ms/maps/spot/d278b42dc4a8adc983a24e2c04333665.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | 39bca112-f093-4a7b-90eb-b7993920e5c4 | True |
| 2 | 3 | Spot | Spot | Gwithian |  | 50.2235 | -5.399 | 50.2 | -5.5 | 6 | nan | /Gwithian-Surf-Report/3/ | 0.5 | 285 | 105 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 5842041f4e65fad6a7708c95 | 38 | True | Perranporth | True | 0 | https://chart-5.msw.ms/maps/spot/2a4608d0e793ee20f4566ca85f5ba6cd.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | 6b0785be-1efb-413d-a5a9-ba2133c6ef68 | True |
| 3 | 4 | Spot | Spot | Sennen |  | 50.0802 | -5.6976 | 50.07 | -5.7 | 6 | nan | /Sennen-Surf-Report/4/ | 0.8 | 270 | 90 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 5842041f4e65fad6a7708c97 | 38 | True |  | True | 0 | https://chart-4.msw.ms/maps/spot/c1be3fe6871d15e4ea5297193b8b81da.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | a641e633-8692-4d4b-b2d6-c4e1d4132c9b | True |
| 4 | 5 | Spot | Spot | Constantine |  | 50.5333 | -5.0221 | 50.5759 | -4.92239 | 8 | nan | /Constantine-Surf-Report/5/ | 1 | 270 | 90 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 584204204e65fad6a77090b3 | 38 | True |  | True | 0 | https://chart-3.msw.ms/maps/spot/47b00f609d5e46cda66040d8b811bae6.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | 1daacdd5-a92a-4f7c-bc7c-af30a392ef7d | True |
| 5 | 6 | Spot | Spot | Bude - Crooklets |  | 50.8358 | -4.5548 | 50.8336 | -4.56057 | 8 | nan | /Bude-Crooklets-Surf-Report/6/ | 1 | 270 | 90 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 5842041f4e65fad6a7708ca5 | 38 | True |  | True | 0 | https://chart-1.msw.ms/maps/spot/553d3a850372eee8b10d13d23cbdb78e.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | 6cb522d3-a781-45ae-83cd-fcc941fd47cb | True |
| 6 | 7 | Spot | Spot | Croyde Beach |  | 51.1302 | -4.2435 | 51.1449 | -4.25995 | 9 | nan | /Croyde-Beach-Surf-Report/7/ | 0.8 | 270 | 90 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 5842041f4e65fad6a7708ca4 | 38 | True | Ilfracombe, England | True | 0 | https://chart-3.msw.ms/maps/spot/0f967e1e6130e9cb1b2623aafe966b58.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | 2dca4454-5789-4be3-808e-f512fef45dc3 | True |
| 7 | 8 | Spot | Spot | Praa Sands |  | 50.103 | -5.391 | 50 | -3.87 | 5 | nan | /Praa-Sands-Surf-Report/8/ | 0.8 | 210 | 30 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | GLOB_30m | 5842041f4e65fad6a7708c9a | 38 | True |  | True | 0 | https://chart-4.msw.ms/maps/spot/aea8da3ce8bd22228c07c79db8e9b8de.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | a166dde9-2d1c-4a55-bb34-5a6efce93986 | True |
| 8 | 9 | Spot | Spot | Whitsand Bay |  | 50.3387 | -4.2434 | 50.3334 | -4.2433 | 5 | nan | /Whitsand-Bay-Surf-Report/9/ | 0.7 | 225 | 45 | Europe/London | 3600 | glo_30m | False | directional | BST | True | nan | UK_4m | 584204204e65fad6a77090c5 | 42 | True |  | True | 0 | https://chart-3.msw.ms/maps/spot/1fe1f342742ba3cf7dd3f8d9943948cc.png | nan | gfs.0p25 | -62169984000 | False | 1617982527 | c6b42c46-7db3-4e53-8f38-b28db957b4e7 | True |
| 9 | 10 | Spot | Spot | Bantham |  | 50.2787 | -3.8885 | 50 | -3.87 | 5 | nan | /Bantham-Surf-Report/10/ | 0.8 | 230 | 65 | Europe/London | 3600 | glo_30m | False | directional | BST | True | 2 | UK_4m | 584204204e65fad6a77090c9 | 42 | True | River Yealm | True | 0 | https://chart-1.msw.ms/maps/spot/358c02090c0c31888fee4794b39d397c.png | nan | gfs.0p25 | -62169984000 | False | 1646829186 | d3566d34-b58d-4803-8cf2-3e3dc5fc1a48 | True |

Is this what you're after?
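On the second part of the question: Python strings have no `contains` method; the usual way to test for a substring is the `in` operator. Below is a minimal sketch of that check, assuming the hrefs are either relative paths like the ones in the `url` column above or full URLs. The helper name `surf_report_links` and the `BASE_URL` constant are made up for illustration, and the sample list is hand-written rather than scraped.

```python
from urllib.parse import urljoin

# Assumed site root, used to turn relative paths into absolute URLs
BASE_URL = "https://magicseaweed.com"


def surf_report_links(hrefs, needle="Surf-Report"):
    """Return absolute URLs for every href that contains `needle`."""
    links = []
    for href in hrefs:
        # tag.get('href') returns None for <a> tags without an href,
        # so skip empty values before testing for the substring
        if href and needle in href:
            links.append(urljoin(BASE_URL, href))
    return links


# Hand-written sample: two paths from the `url` column, a None,
# and one of the county page URLs from the question
sample = [
    "/Newquay-Fistral-North-Surf-Report/1/",
    None,
    "https://magicseaweed.com/New-Jersey-Ocean-County-Surfing/278/",
    "/Porthtowan-Surf-Report/2/",
]

for link in surf_report_links(sample):
    print(link)
```

With the DataFrame from the API call, the equivalent filter would probably be `df[df['url'].str.contains('Surf-Report', na=False)]`, since pandas string columns do expose a `str.contains` method.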
