一尘不染

How do I access elements in a structured JSON-LD or microdata list?

python

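I am trying to access items in a JSON-LD list/dictionary with Python.

If possible, I would like to know whether the data contains a product and, if it does, what that product's URL, price, and availability are.

The information is right there in the metadata. How do I access it?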

data = {'json-ld': [{'@context': 'http://schema.org',
          '@type': 'WebSite',
          'potentialAction': {'@type': 'SearchAction',
                              'query-input': 'required '
                                             'name=search_term_string',
                              'target': 'https://www.vitenda.de/search/result?term={search_term_string}'},
          'url': 'https://www.vitenda.de'},
         {'@context': 'http://schema.org',
          '@type': 'Organization',
          'logo': 'https://www.vitenda.de/documents/logo/logo_vitenda_02_646.png',
          'url': 'https://www.vitenda.de'},
         {'@context': 'http://schema.org/',
          '@type': 'BreadcrumbList',
          'itemListElement': [{'@type': 'ListItem',
                               'item': {'@id': 'https://www.vitenda.de/search',
                                        'name': 'Artikelsuche'},
                               'position': 1},
                              {'@type': 'ListItem',
                               'item': {'@id': '',
                                        'name': 'Ihre Suchergebnisse für '
                                                "<b>'11287708'</b> (1 "
                                                'Produkte)'},
                               'position': 2}]},
         {'@context': 'http://schema.org/',
          '@type': 'Product',
          'brand': {'@type': 'Organization', 'name': 'ALIUD Pharma GmbH'},
          'description': '',
          'gtin': '',
          'image': 'https://cdn1.apopixx.de/300/web_schraeg_png/11287708.png?ver=1649058520',
          'itemCondition': 'https://schema.org/NewCondition',
          'name': 'GINKGO AL 240 mg Filmtabletten',
          'offers': {'@type': 'Offer',
                     'availability': 'http://schema.org/InStock',
                     'deliveryLeadTime': {'@type': 'QuantitativeValue',
                                          'minValue': '3'},
                     'price': 96.36,
                     'priceCurrency': 'EUR',
                     'priceValidUntil': '19-06-2022 18:41:54',
                     'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'},
          'productID': '11287708',
          'sku': '11287708',
          'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'}]}



# Test for the key on the dictionary itself, not on the string stored under it.
if '@context' in data['json-ld'][0]:
    print('yes')
else:
    print('no')

print(data['json-ld'][3])

2022-06-13

1 answer

一尘不染

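It looks like each product is a dictionary with an '@type' key whose value is 'Product', so if we filter for those dictionaries and iterate over them, we can get what you want: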

# Keep only the entries whose '@type' is 'Product'.
products = [d for d in data['json-ld'] if d.get('@type') == 'Product']

if products:
    print(f'Found {len(products)} product{"s" if len(products) != 1 else ""}:')
else:
    print('No products found')

for product in products:
    name = product['name']
    offers = product.get('offers', {})
    # availability is a schema.org URL such as 'http://schema.org/InStock'.
    available = 'InStock' in offers.get('availability', '')
    price = f'{offers["price"]:.2f} {offers["priceCurrency"]}' if available else 'not available'
    url = product['url']
    print(f'{name} ({price}), {url}')

Output:

Found 1 product:
GINKGO AL 240 mg Filmtabletten (96.36 EUR), https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708
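
If you need the values themselves rather than a printed line, the same filtering can be wrapped in a small helper. This is only a sketch: extract_products and the keys of the dictionaries it returns are names made up for illustration, and the sample is a trimmed-down copy of the data from the question. It also assumes that other pages might supply 'offers' as a list of offers instead of a single dictionary, or leave it out entirely, which the data in the question does not show.

```python
def extract_products(data):
    """Return one summary dict per offer of every Product entry in data['json-ld']."""
    results = []
    for entry in data.get('json-ld', []):
        if entry.get('@type') != 'Product':
            continue
        # Assumption: 'offers' may be a single dict, a list of dicts, or missing.
        offers = entry.get('offers') or [{}]
        if isinstance(offers, dict):
            offers = [offers]
        for offer in offers:
            availability = offer.get('availability', '')
            results.append({
                'name': entry.get('name'),
                # Prefer the offer's own URL, fall back to the product URL.
                'url': offer.get('url') or entry.get('url'),
                'price': offer.get('price'),
                'currency': offer.get('priceCurrency'),
                # 'http://schema.org/InStock' becomes 'InStock'; None if absent.
                'availability': availability.rsplit('/', 1)[-1] or None,
                'in_stock': availability.endswith('InStock'),
            })
    return results


# Trimmed-down sample in the same shape as the data in the question.
sample = {'json-ld': [
    {'@type': 'Organization', 'url': 'https://www.vitenda.de'},
    {'@type': 'Product',
     'name': 'GINKGO AL 240 mg Filmtabletten',
     'offers': {'@type': 'Offer',
                'availability': 'http://schema.org/InStock',
                'price': 96.36,
                'priceCurrency': 'EUR',
                'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'},
     'url': 'https://www.vitenda.de/ginkgo-al-240-mg-filmtabletten.11287708'},
]}

for item in extract_products(sample):
    print(item)
```

An empty result list means no product was found. Using .get() throughout means a product with no price or no availability yields None for that field instead of raising a KeyError.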
2022-06-13