apify-sdk-python
Index
Classes
Methods
- __delitem__
- __get__
- __getitem__
- __init__
- __init__
- __init__
- __init__
- __init__
- __init__
- __init__
- __init__
- __init__
- __init__
- __init__
- __iter__
- __len__
- __setitem__
- add_request
- close
- dataset
- datasets
- delete
- delete
- delete
- delete_record
- delete_request
- emit
- get
- get
- get
- get
- get_items_as_bytes
- get_or_create
- get_or_create
- get_or_create
- get_or_create
- get_record
- get_record_as_bytes
- get_request
- get_storage_client
- init
- items
- iterate_items
- key_value_store
- key_value_stores
- list
- list
- list
- list
- list_head
- list_items
- list_keys
- off
- on
- open
- push_items
- request_queue
- request_queues
- set_cloud_client
- set_config
- set_record
- stream_items
- stream_record
- update
- update
- update
- update_request
- values
- wait_for_all_listeners_to_complete
Properties
- __version__
- API_PROCESSED_REQUESTS_DELAY_MILLIS
- APIFY_PROXY_VALUE_REGEX
- BASE64_REGEXP
- BaseResourceClientType
- BaseResourceCollectionClientType
- BOOL_ENV_VARS
- COUNTRY_CODE_REGEX
- DATETIME_ENV_VARS
- DEFAULT_API_PARAM_LIMIT
- DEFAULT_LOCAL_FILE_EXTENSION
- DualPropertyOwner
- DualPropertyType
- EFFECTIVE_LIMIT_BYTES
- ENCRYPTED_INPUT_VALUE_PREFIX
- ENCRYPTED_INPUT_VALUE_REGEXP
- ENCRYPTION_AUTH_TAG_LENGTH
- ENCRYPTION_IV_LENGTH
- ENCRYPTION_KEY_LENGTH
- EVENT_LISTENERS_TIMEOUT_SECS
- FLOAT_ENV_VARS
- ImplementationType
- INTEGER_ENV_VARS
- IterateKeysInfo
- IterateKeysTuple
- JSONSerializable
- LIST_ITEMS_LIMIT
- ListOrDictOrAny
- LOCAL_ENTRY_NAME_DIGITS
- logger
- logger_name
- MainReturnType
- MAX_CACHED_REQUESTS
- MAX_PAYLOAD_SIZE_BYTES
- MAX_QUERIES_FOR_CONSISTENCY
- MetadataType
- PARSE_DATE_FIELDS_KEY_SUFFIX
- PARSE_DATE_FIELDS_MAX_DEPTH
- QUERY_HEAD_BUFFER
- QUERY_HEAD_MIN_LENGTH
- RECENTLY_HANDLED_CACHE_SIZE
- REQUEST_ID_LENGTH
- REQUEST_QUEUE_HEAD_MAX_LIMIT
- ResourceClientType
- SAFETY_BUFFER_PERCENT
- SESSION_ID_MAX_LENGTH
- STORAGE_CONSISTENCY_DELAY_MILLIS
- STRING_ENV_VARS
- T
- T
Methods
__delitem__
Remove an item from the cache.
Parameters
key: str
Returns None
__get__
Call the getter with the right object.
Parameters
obj: Optional[DualPropertyOwner] (optional)
The instance of class T on which the getter will be called
owner: Type[DualPropertyOwner]
The class object of class T on which the getter will be called, if obj is None
Returns DualPropertyType
The result of the getter.
__getitem__
Get an item from the cache. Move it to the end if present.
Parameters
key: str
Returns T
__init__
Create an instance of EventManager.
Parameters
config: Configuration
The actor configuration to be used in this event manager.
Returns None
__init__
Initialize the dualproperty.
Parameters
getter: Callable[..., DualPropertyType]
The getter of the property. It should accept either an instance or a class as its first argument.
Returns None
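The __get__ and __init__ entries above describe a descriptor whose getter receives either an instance or the class itself. A minimal sketch of that pattern follows. The names dual_property_sketch and Greeter are hypothetical and only illustrate the documented behaviour; this is not the SDK's implementation.

```python
from typing import Any, Callable, Generic, Optional, Type, TypeVar

R = TypeVar("R")


class dual_property_sketch(Generic[R]):
    """Hypothetical descriptor: works on both the class and its instances."""

    def __init__(self, getter: Callable[..., R]) -> None:
        # The getter must accept either an instance or a class as its first argument.
        self.getter = getter

    def __get__(self, obj: Optional[Any], owner: Type[Any]) -> R:
        # Accessed on an instance: obj is the instance.
        # Accessed on the class: obj is None, so fall back to the owner class.
        return self.getter(obj if obj is not None else owner)


class Greeter:
    name = "class-level"

    def __init__(self, name: str) -> None:
        self.name = name

    @dual_property_sketch
    def label(self_or_cls: Any) -> str:
        return f"label:{self_or_cls.name}"


on_class = Greeter.label
on_instance = Greeter("instance-level").label
```

Accessing label on the class passes the class to the getter, while accessing it on an instance passes that instance.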
__init__
Create a LRUCache with a specific max_length.
Parameters
max_length: int
Returns None
__init__
Initialize the MemoryStorageClient.
Parameters
local_data_directory: Optional[str] = None (optional, keyword-only)
A local directory where all data will be persisted
write_metadata: Optional[bool] = None (optional, keyword-only)
Whether to persist metadata of the storages as well
persist_storage: Optional[bool] = None (optional, keyword-only)
Whether to persist the data to the local_data_directory or just keep it in memory
Returns None
__init__
Initialize the DatasetCollectionClient with the passed arguments.
Parameters
base_storage_directory: str (keyword-only)
memory_storage_client: MemoryStorageClient (keyword-only)
Returns None
__init__
Initialize the DatasetClient.
Parameters
base_storage_directory: str (keyword-only)
memory_storage_client: MemoryStorageClient (keyword-only)
id: Optional[str] = None (optional, keyword-only)
name: Optional[str] = None (optional, keyword-only)
Returns None
__init__
Initialize the KeyValueStoreClient.
Parameters
base_storage_directory: str (keyword-only)
memory_storage_client: MemoryStorageClient (keyword-only)
id: Optional[str] = None (optional, keyword-only)
name: Optional[str] = None (optional, keyword-only)
Returns None
__init__
Initialize the RequestQueueClient.
Parameters
base_storage_directory: str (keyword-only)
memory_storage_client: MemoryStorageClient (keyword-only)
id: Optional[str] = None (optional, keyword-only)
name: Optional[str] = None (optional, keyword-only)
Returns None
__init__
Initialize the BaseResourceClient.
Parameters
base_storage_directory: str (keyword-only)
memory_storage_client: MemoryStorageClient (keyword-only)
id: Optional[str] = None (optional, keyword-only)
name: Optional[str] = None (optional, keyword-only)
Returns None
__init__
Create a StorageClientManager instance.
Returns None
__init__
Initialize the storage.
Do not use this method directly, but use Actor.open_<STORAGE>() instead.
Parameters
id: str
The storage id
name: Optional[str] (optional)
The storage name
client: Union[ApifyClientAsync, MemoryStorageClient]
The storage client
config: Configuration
The configuration
Returns Undefined
__iter__
Iterate over the keys of the cache in order of insertion.
Returns Iterator[str]
__len__
Get the number of items in the cache.
Returns int
__setitem__
Add an item to the cache. Remove least used item if max_length exceeded.
Parameters
key: str
value: T
Returns None
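The cache methods documented above (__getitem__ moving a key to the end, __setitem__ evicting when max_length is exceeded, __iter__ and __len__) can be pictured with a small stand-in built on collections.OrderedDict. This is a sketch under that assumption; LRUCacheSketch is a hypothetical name and the SDK's own class may differ in detail.

```python
from collections import OrderedDict
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")


class LRUCacheSketch(Generic[T]):
    """Hypothetical least-recently-used cache with a fixed maximum length."""

    def __init__(self, max_length: int) -> None:
        self._cache: "OrderedDict[str, T]" = OrderedDict()
        self._max_length = max_length

    def __getitem__(self, key: str) -> T:
        # Reading a key marks it as most recently used (raises KeyError if absent).
        self._cache.move_to_end(key)
        return self._cache[key]

    def __setitem__(self, key: str, value: T) -> None:
        self._cache[key] = value
        # Drop the oldest entry once the cache grows past its limit.
        if len(self._cache) > self._max_length:
            self._cache.popitem(last=False)

    def __delitem__(self, key: str) -> None:
        del self._cache[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._cache)

    def __len__(self) -> int:
        return len(self._cache)


cache: LRUCacheSketch[int] = LRUCacheSketch(max_length=2)
cache["a"] = 1
cache["b"] = 2
_ = cache["a"]   # "a" becomes the most recently used key
cache["c"] = 3   # the limit is exceeded, so the oldest key ("b") is evicted
remaining_keys = list(cache)
```

After these operations the cache holds "a" and "c", because reading "a" protected it from eviction.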
add_request
Add a request to the queue.
Parameters
request: Dict
The request to add to the queue
forefront: Optional[bool] = None (optional, keyword-only)
Whether to add the request to the head or the end of the queue
Returns Dict
dict: The added request.
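To illustrate what the forefront flag means for ordering, here is a minimal in-memory sketch using collections.deque. RequestQueueSketch and the shape of the returned dictionaries are assumptions made for the example, not the client's actual return format.

```python
from collections import deque
from typing import Deque, Dict, Optional


class RequestQueueSketch:
    """Hypothetical in-memory queue showing head versus tail insertion."""

    def __init__(self) -> None:
        self._pending: Deque[Dict] = deque()

    def add_request(self, request: Dict, *, forefront: Optional[bool] = None) -> Dict:
        if forefront:
            # forefront=True: the request goes to the head of the queue.
            self._pending.appendleft(request)
        else:
            # Default: the request goes to the end of the queue.
            self._pending.append(request)
        return request

    def list_head(self, *, limit: Optional[int] = None) -> Dict:
        items = list(self._pending)
        return {"items": items if limit is None else items[:limit]}


queue = RequestQueueSketch()
queue.add_request({"url": "https://example.com/a"})
queue.add_request({"url": "https://example.com/b"})
queue.add_request({"url": "https://example.com/urgent"}, forefront=True)
head_urls = [request["url"] for request in queue.list_head(limit=2)["items"]]
```

The request added with forefront=True is returned first when the head of the queue is listed.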
close
Close the event manager.
This will stop listening for the platform events, and it will wait for all the event listeners to finish.
Parameters
event_listeners_timeout_secs: Optional[float] = None (optional)
Optional timeout after which the pending event listeners are canceled.
Returns None
dataset
Retrieve the sub-client for manipulating a single dataset.
Parameters
dataset_id: str
ID of the dataset to be manipulated
Returns DatasetClient
datasets
Retrieve the sub-client for manipulating datasets.
Returns DatasetCollectionClient
delete
Delete the dataset.
Returns None
delete
Delete the key-value store.
Returns None
delete
Delete the request queue.
Returns None
delete_record
Delete the specified record from the key-value store.
Parameters
key: str
The key of the record to delete
Returns None
delete_request
Delete a request from the queue.
Parameters
request_id: str
ID of the request to delete.
Returns None
emit
Emit an actor event manually.
Parameters
event_name: ActorEventTypes
The actor event which should be emitted.
data: Any
The data that should be emitted with the event.
Returns None
get
Retrieve the dataset.
Returns Optional[Dict]
dict, optional: The retrieved dataset, or None, if it does not exist
get
Retrieve the key-value store.
Returns Optional[Dict]
dict, optional: The retrieved key-value store, or None if it does not exist
get
Retrieve the request queue.
Returns Optional[Dict]
dict, optional: The retrieved request queue, or None, if it does not exist
get
Retrieve the storage.
Returns Optional[Dict]
dict, optional: The retrieved storage, or None, if it does not exist
get_items_as_bytes
Parameters
_args: Any
_kwargs: Any
Returns bytes
get_or_create
Retrieve a named key-value store, or create a new one when it doesn't exist.
Parameters
name: Optional[str] = None (optional, keyword-only)
The name of the key-value store to retrieve or create.
schema: Optional[Dict] = None (optional, keyword-only)
The schema of the key-value store
_id: Optional[str] = None (optional, keyword-only)
Returns Dict
dict: The retrieved or newly-created key-value store.
get_or_create
Retrieve a named storage, or create a new one when it doesn't exist.
Parameters
name: Optional[str] = None (optional, keyword-only)
The name of the storage to retrieve or create.
schema: Optional[Dict] = None (optional, keyword-only)
The schema of the storage
_id: Optional[str] = None (optional, keyword-only)
Returns Dict
dict: The retrieved or newly-created storage.
get_or_create
Retrieve a named request queue, or create a new one when it doesn't exist.
Parameters
name: Optional[str] = None (optional, keyword-only)
The name of the request queue to retrieve or create.
schema: Optional[Dict] = None (optional, keyword-only)
The schema of the request queue
_id: Optional[str] = None (optional, keyword-only)
Returns Dict
dict: The retrieved or newly-created request queue.
get_or_create
Retrieve a named dataset, or create a new one when it doesn't exist.
Parameters
name: Optional[str] = None (optional, keyword-only)
The name of the dataset to retrieve or create.
schema: Optional[Dict] = None (optional, keyword-only)
The schema of the dataset
_id: Optional[str] = None (optional, keyword-only)
Returns Dict
dict: The retrieved or newly-created dataset.
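The get_or_create entries above all follow the same idea: look a storage up by name and create it only when it is missing. A minimal sketch of that idea follows. CollectionSketch, the id format, and the choice to always create a fresh storage when no name is given are assumptions for illustration only.

```python
import uuid
from typing import Dict, List, Optional


class CollectionSketch:
    """Hypothetical in-memory collection of named storages."""

    def __init__(self) -> None:
        self._storages: List[Dict] = []

    def get_or_create(self, *, name: Optional[str] = None) -> Dict:
        if name is not None:
            # Reuse an existing storage with the same name, if there is one.
            for storage in self._storages:
                if storage["name"] == name:
                    return storage
        # Otherwise create a new storage (assumption: unnamed storages are never reused).
        storage = {"id": uuid.uuid4().hex, "name": name}
        self._storages.append(storage)
        return storage


collection = CollectionSketch()
created = collection.get_or_create(name="results")
retrieved = collection.get_or_create(name="results")
unnamed_one = collection.get_or_create()
unnamed_two = collection.get_or_create()
```

Calling the method twice with the same name returns the same storage, while each unnamed call in this sketch produces a new one.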
get_record
Retrieve the given record from the key-value store.
Parameters
key: str
Key of the record to retrieve
Returns Optional[Dict]
dict, optional: The requested record, or None, if the record does not exist
get_record_as_bytes
Retrieve the given record from the key-value store, without parsing it.
Parameters
key: str
Key of the record to retrieve
Returns Optional[Dict]
dict, optional: The requested record, or None, if the record does not exist
get_request
Retrieve a request from the queue.
Parameters
request_id: str
ID of the request to retrieve
Returns Optional[Dict]
dict, optional: The retrieved request, or None, if it did not exist.
get_storage_client
Get the current storage client instance.
Parameters
force_cloud: bool = False (optional)
Returns Union[ApifyClientAsync, MemoryStorageClient]
ApifyClientAsync or MemoryStorageClient: The current storage client instance.
init
Initialize the event manager.
When running this on the Apify Platform, this will start processing events sent by the platform to the events websocket and emitting them as events that can be listened to by the Actor.on() method.
Returns None
items
Iterate over the pairs of (key, value) in the cache in order of insertion.
Returns ItemsView[str, T]
iterate_items
Iterate over the items in the dataset.
Parameters
offset: int = 0 (optional, keyword-only)
Number of items that should be skipped at the start. The default value is 0
limit: Optional[int] = None (optional, keyword-only)
Maximum number of items to return. By default there is no limit.
clean: Optional[bool] = None (optional, keyword-only)
If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is just a shortcut for the skip_hidden=True and skip_empty=True parameters. Note that because some objects might be skipped from the output, the result might contain fewer items than the limit value.
desc: Optional[bool] = None (optional, keyword-only)
By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
fields: Optional[List[str]] = None (optional, keyword-only)
A list of fields which should be picked from the items; only these fields will remain in the resulting record objects. Note that the fields in the output items are sorted the same way as they are specified in the fields parameter. You can use this feature to effectively fix the output format.
omit: Optional[List[str]] = None (optional, keyword-only)
A list of fields which should be omitted from the items.
unwind: Optional[str] = None (optional, keyword-only)
Name of a field which should be unwound. If the field is an array, then every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, then it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, then the item is preserved as it is. Note that the unwound items ignore the desc parameter.
skip_empty: Optional[bool] = None (optional, keyword-only)
If True, then empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.
skip_hidden: Optional[bool] = None (optional, keyword-only)
If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
Returns AsyncIterator[Dict]
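The unwind rules described above (array elements become separate records, an object is merged into its parent, anything else leaves the item untouched) can be sketched as a small pure function. unwind_sketch is a hypothetical helper. Keeping non-object array elements under the original field name is an assumption, since the description does not say how they are merged.

```python
from typing import Any, Dict, List


def unwind_sketch(item: Dict[str, Any], field: str) -> List[Dict[str, Any]]:
    """Hypothetical illustration of the documented unwind behaviour."""
    value = item.get(field)
    parent = {key: val for key, val in item.items() if key != field}

    if isinstance(value, list):
        # Every element of the array becomes its own record, merged with the parent.
        return [
            {**parent, **element} if isinstance(element, dict) else {**parent, field: element}
            for element in value
        ]
    if isinstance(value, dict):
        # An object is merged directly into the parent.
        return [{**parent, **value}]
    # Missing field, or a value that cannot be merged: keep the item as it is.
    return [item]


from_array = unwind_sketch({"id": 1, "variants": [{"size": "S"}, {"size": "M"}]}, "variants")
from_object = unwind_sketch({"id": 2, "meta": {"lang": "en"}}, "meta")
untouched = unwind_sketch({"id": 3}, "variants")
```

One input item can therefore expand into several output records, which is why unwinding changes the number of items returned.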
key_value_store
Retrieve the sub-client for manipulating a single key-value store.
Parameters
key_value_store_id: str
ID of the key-value store to be manipulated
Returns KeyValueStoreClient
key_value_stores
Retrieve the sub-client for manipulating key-value stores.
Returns KeyValueStoreCollectionClient
list
List the available key-value stores.
Returns ListPage
ListPage: The list of available key-value stores matching the specified filters.
list
List the available storages.
Returns ListPage
ListPage: The list of available storages matching the specified filters.
list
List the available request queues.
Returns ListPage
ListPage: The list of available request queues matching the specified filters.
list
List the available datasets.
Returns ListPage
ListPage: The list of available datasets matching the specified filters.
list_head
Retrieve a given number of requests from the beginning of the queue.
Parameters
limit: Optional[int] = None (optional, keyword-only)
How many requests to retrieve
Returns Dict
dict: The desired number of requests from the beginning of the queue.
list_items
List the items of the dataset.
Parameters
offset: Optional[int] = 0 (optional, keyword-only)
Number of items that should be skipped at the start. The default value is 0
limit: Optional[int] = LIST_ITEMS_LIMIT (optional, keyword-only)
Maximum number of items to return. Defaults to LIST_ITEMS_LIMIT.
clean: Optional[bool] = None (optional, keyword-only)
If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is just a shortcut for the skip_hidden=True and skip_empty=True parameters. Note that because some objects might be skipped from the output, the result might contain fewer items than the limit value.
desc: Optional[bool] = None (optional, keyword-only)
By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
fields: Optional[List[str]] = None (optional, keyword-only)
A list of fields which should be picked from the items; only these fields will remain in the resulting record objects. Note that the fields in the output items are sorted the same way as they are specified in the fields parameter. You can use this feature to effectively fix the output format.
omit: Optional[List[str]] = None (optional, keyword-only)
A list of fields which should be omitted from the items.
unwind: Optional[str] = None (optional, keyword-only)
Name of a field which should be unwound. If the field is an array, then every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, then it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, then the item is preserved as it is. Note that the unwound items ignore the desc parameter.
skip_empty: Optional[bool] = None (optional, keyword-only)
If True, then empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.
skip_hidden: Optional[bool] = None (optional, keyword-only)
If True, then hidden fields are skipped from the output, i.e. fields starting with the # character.
flatten: Optional[List[str]] = None (optional, keyword-only)
A list of fields that should be flattened
view: Optional[str] = None (optional, keyword-only)
Name of the dataset view to be used
Returns ListPage
ListPage: A page of the list of dataset items according to the specified filters.
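The interaction between offset, limit, desc, clean, fields and omit is easier to see on concrete data. The sketch below applies them to a plain list of dictionaries. list_items_sketch is a hypothetical helper, and the order in which the options are applied is an assumption chosen to match the note that filtering can leave fewer items than the limit.

```python
from typing import Any, Dict, List, Optional


def list_items_sketch(
    items: List[Dict[str, Any]],
    *,
    offset: int = 0,
    limit: Optional[int] = None,
    clean: bool = False,
    desc: bool = False,
    fields: Optional[List[str]] = None,
    omit: Optional[List[str]] = None,
    skip_empty: bool = False,
    skip_hidden: bool = False,
) -> List[Dict[str, Any]]:
    """Hypothetical in-memory version of the documented item filters."""
    if clean:
        # clean is a shortcut for skip_hidden=True and skip_empty=True.
        skip_empty = skip_hidden = True

    ordered = list(reversed(items)) if desc else list(items)
    window = ordered[offset:] if limit is None else ordered[offset:offset + limit]

    result = []
    for item in window:
        record = dict(item)
        if skip_hidden:
            record = {k: v for k, v in record.items() if not k.startswith("#")}
        if fields is not None:
            # Keys follow the order given in the fields parameter.
            record = {k: record[k] for k in fields if k in record}
        if omit:
            record = {k: v for k, v in record.items() if k not in omit}
        if skip_empty and not record:
            continue  # Skipped items are why the result can be shorter than limit.
        result.append(record)
    return result


stored = [{"title": "a", "#debug": 1}, {}, {"title": "c", "price": 3}]
cleaned = list_items_sketch(stored, clean=True, limit=2)
picked = list_items_sketch(stored, desc=True, fields=["price", "title"])
```

With clean=True and limit=2, only one item comes back, because the empty second item is dropped after the window has been taken.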
list_keys
List the keys in the key-value store.
Parameters
limit: int = DEFAULT_API_PARAM_LIMIT (optional, keyword-only)
Number of keys to be returned. Maximum value is 1000
exclusive_start_key: Optional[str] = None (optional, keyword-only)
All keys up to and including this one are skipped from the result
Returns Dict
dict: The list of keys in the key-value store matching the given arguments
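The exclusive_start_key parameter lends itself to page-by-page iteration: pass the last key of one page as the start key of the next. The sketch below shows that loop against an in-memory dictionary. list_keys_sketch, the sorted key order, and the names of the returned fields are all assumptions for illustration and may not match the real response.

```python
from typing import Any, Dict, List, Optional


def list_keys_sketch(
    store: Dict[str, Any],
    *,
    limit: int = 1000,
    exclusive_start_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical key listing over an in-memory store, sorted by key."""
    keys = sorted(store)
    if exclusive_start_key is not None:
        # Skip every key up to and including the start key.
        keys = [key for key in keys if key > exclusive_start_key]
    page = keys[:limit]
    is_truncated = len(keys) > limit
    return {
        "items": page,
        "is_truncated": is_truncated,
        "next_exclusive_start_key": page[-1] if page and is_truncated else None,
    }


store = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
all_keys: List[str] = []
start_key: Optional[str] = None
while True:
    page = list_keys_sketch(store, limit=2, exclusive_start_key=start_key)
    all_keys.extend(page["items"])
    start_key = page["next_exclusive_start_key"]
    if start_key is None:
        break
```

The loop stops once a page reports no further start key, at which point every key has been visited exactly once.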
off
Remove a listener, or all listeners, from an actor event.
Parameters
event_name: ActorEventTypes
The actor event for which to remove listeners.
listener: Optional[Callable] = None (optional)
The listener which is supposed to be removed. If not passed, all listeners of this event are removed.
Returns None
on
Add an event listener to the event manager.
Parameters
event_name: ActorEventTypes
The actor event to listen to.
listener: Callable
The function which is to be called when the event is emitted (can be async).
Returns Callable
open
Open a storage, or return a cached storage object if it was opened before.
Opens a storage with the given ID or name. Returns the cached storage object if the storage was opened before.
Parameters
id: Optional[str] = None (optional, keyword-only)
ID of the storage to be opened. If neither id nor name is provided, the method returns the default storage associated with the actor run. If the storage with the given ID does not exist, it raises an error.
name: Optional[str] = None (optional, keyword-only)
Name of the storage to be opened. If neither id nor name is provided, the method returns the default storage associated with the actor run. If the storage with the given name does not exist, it is created.
force_cloud: bool = False (optional, keyword-only)
If set to True, it will open a storage on the Apify Platform even when running the actor locally. Defaults to False.
config: Optional[Configuration] = None (optional, keyword-only)
A Configuration instance; the global configuration is used if omitted.
Returns Self
An instance of the storage.
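The rules for open (default storage when neither id nor name is given, an error for an unknown id, creation for an unknown name, and a cached object on repeated calls) can be sketched with a class-level cache. StorageSketch, its fake backend, the generated id format, and the rejection of id and name together are assumptions for the example, not the SDK's actual code.

```python
from typing import Dict, Optional


class StorageSketch:
    """Hypothetical storage class with a cache of already opened instances."""

    _backend: Dict[str, Optional[str]] = {"default": None}  # storage id -> name
    _opened: Dict[str, "StorageSketch"] = {}                 # storage id -> cached instance

    def __init__(self, id: str, name: Optional[str]) -> None:
        self.id = id
        self.name = name

    @classmethod
    def open(cls, *, id: Optional[str] = None, name: Optional[str] = None) -> "StorageSketch":
        if id is not None and name is not None:
            # Assumption: passing both is treated as an error in this sketch.
            raise ValueError("Pass either id or name, not both.")
        if id is None and name is None:
            id = "default"  # Neither given: use the default storage.
        elif name is not None:
            id = next((i for i, n in cls._backend.items() if n == name), None)
            if id is None:
                # Unknown name: the storage is created.
                id = f"generated-{len(cls._backend)}"
                cls._backend[id] = name
        elif id not in cls._backend:
            # Unknown id: an error is raised.
            raise LookupError(f"Storage with id {id!r} does not exist.")
        if id not in cls._opened:
            cls._opened[id] = cls(id, cls._backend[id])
        return cls._opened[id]


first = StorageSketch.open(name="results")
second = StorageSketch.open(name="results")
default_storage = StorageSketch.open()
```

Opening the same name twice yields the same object, so state attached to the storage instance is shared across callers.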
push_items
Push items to the dataset.
Parameters
items: JSONSerializable
The items to push to the dataset. Either a stringified JSON, a dictionary, or a list of strings or dictionaries.
Returns None
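Since push_items accepts several input shapes, it can help to see how they might be reduced to one list of records. The helper below is a sketch of such a normalization step. normalize_items_sketch is a hypothetical name, and treating every string as stringified JSON is an assumption based on the parameter description.

```python
import json
from typing import Any, List


def normalize_items_sketch(items: Any) -> List[Any]:
    """Hypothetical normalization of the accepted push_items inputs."""
    if isinstance(items, str):
        # A stringified JSON document is parsed first.
        items = json.loads(items)
    if isinstance(items, dict):
        # A single dictionary becomes a one-element list.
        return [items]
    if isinstance(items, list):
        # List elements may themselves be stringified JSON or dictionaries.
        return [json.loads(item) if isinstance(item, str) else item for item in items]
    raise TypeError(f"Unsupported items type: {type(items).__name__}")


from_string = normalize_items_sketch('{"title": "a"}')
from_dict = normalize_items_sketch({"title": "b"})
from_mixed_list = normalize_items_sketch([{"title": "c"}, '{"title": "d"}'])
```

All three calls produce a list of dictionaries, which is the form a dataset would ultimately store.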
request_queue
Retrieve the sub-client for manipulating a single request queue.
Parameters
request_queue_id: str
ID of the request queue to be manipulated
client_key: Optional[str] = None (optional, keyword-only)
A unique identifier of the client accessing the request queue
Returns RequestQueueClient
request_queues
Retrieve the sub-client for manipulating request queues.
Returns RequestQueueCollectionClient
set_cloud_client
Set the storage client.
Parameters
client: ApifyClientAsync
The instance of a storage client.
Returns None
set_config
Set the config for the StorageClientManager.
Parameters
config: Configuration
The configuration this StorageClientManager should use.
Returns None
set_record
Set a value to the given record in the key-value store.
Parameters
key: str
The key of the record to save the value to
value: Any
The value to save into the record
content_type: Optional[str] = None (optional)
The content type of the saved value
Returns None
stream_items
Parameters
_args: Any
_kwargs: Any
Returns AsyncIterator
stream_record
Parameters
_key: str
Returns AsyncIterator[Optional[Dict]]
update
Update the dataset with specified fields.
Parameters
name: Optional[str] = None (optional, keyword-only)
The new name for the dataset
Returns Dict
dict: The updated dataset
update
Update the key-value store with specified fields.
Parameters
name: Optional[str] = None (optional, keyword-only)
The new name for the key-value store
Returns Dict
dict: The updated key-value store
update
Update the request queue with specified fields.
Parameters
name: Optional[str] = None (optional, keyword-only)
The new name for the request queue
Returns Dict
dict: The updated request queue
update_request
Update a request in the queue.
Parameters
request: Dict
The updated request
forefront: Optional[bool] = None (optional, keyword-only)
Whether to put the updated request in the beginning or the end of the queue
Returns Dict
dict: The updated request
values
Iterate over the values in the cache in order of insertion.
Returns ValuesView[T]
wait_for_all_listeners_to_complete
Wait for all event listeners which are currently being executed to complete.
Parameters
timeout_secs: Optional[float] = None (optional, keyword-only)
Timeout for the wait. If the event listeners don't finish before the timeout, they will be canceled.
Returns None
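The event methods documented above (on, off, emit and wait_for_all_listeners_to_complete) fit together as a small publish/subscribe loop. The sketch below shows one way such a loop could work with asyncio. EventManagerSketch and the event name "demo_event" are hypothetical, and the real event manager additionally handles platform events, which this example leaves out.

```python
import asyncio
import inspect
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Optional, Set


class EventManagerSketch:
    """Hypothetical event manager supporting sync and async listeners."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable]] = defaultdict(list)
        self._tasks: Set["asyncio.Future[Any]"] = set()

    def on(self, event_name: str, listener: Callable) -> Callable:
        self._listeners[event_name].append(listener)
        return listener

    def off(self, event_name: str, listener: Optional[Callable] = None) -> None:
        if listener is None:
            # No listener given: remove all listeners of this event.
            self._listeners.pop(event_name, None)
        elif listener in self._listeners.get(event_name, []):
            self._listeners[event_name].remove(listener)

    def emit(self, event_name: str, data: Any) -> None:
        for listener in list(self._listeners.get(event_name, [])):
            result = listener(data)
            if inspect.isawaitable(result):
                # Async listeners run in the background and are tracked until done.
                task = asyncio.ensure_future(result)
                self._tasks.add(task)
                task.add_done_callback(self._tasks.discard)

    async def wait_for_all_listeners_to_complete(
        self, *, timeout_secs: Optional[float] = None
    ) -> None:
        if not self._tasks:
            return
        _, pending = await asyncio.wait(set(self._tasks), timeout=timeout_secs)
        for task in pending:
            task.cancel()  # Listeners that miss the timeout are canceled.


async def demo() -> List[tuple]:
    received: List[tuple] = []
    manager = EventManagerSketch()

    async def slow_listener(data: Any) -> None:
        await asyncio.sleep(0.01)
        received.append(("slow", data))

    def fast_listener(data: Any) -> None:
        received.append(("fast", data))

    manager.on("demo_event", slow_listener)
    manager.on("demo_event", fast_listener)
    manager.emit("demo_event", {"n": 1})
    await manager.wait_for_all_listeners_to_complete(timeout_secs=1.0)

    manager.off("demo_event", fast_listener)
    manager.emit("demo_event", {"n": 2})
    await manager.wait_for_all_listeners_to_complete(timeout_secs=1.0)
    return received


received_events = asyncio.run(demo())
```

After fast_listener is removed with off, only the async listener reacts to the second event.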
Properties
__version__
API_PROCESSED_REQUESTS_DELAY_MILLIS
APIFY_PROXY_VALUE_REGEX
BASE64_REGEXP
BaseResourceClientType
BaseResourceCollectionClientType
BOOL_ENV_VARS
COUNTRY_CODE_REGEX
DATETIME_ENV_VARS
DEFAULT_API_PARAM_LIMIT
DEFAULT_LOCAL_FILE_EXTENSION
DualPropertyOwner
DualPropertyType
EFFECTIVE_LIMIT_BYTES
ENCRYPTED_INPUT_VALUE_PREFIX
ENCRYPTED_INPUT_VALUE_REGEXP
ENCRYPTION_AUTH_TAG_LENGTH
ENCRYPTION_IV_LENGTH
ENCRYPTION_KEY_LENGTH
EVENT_LISTENERS_TIMEOUT_SECS
FLOAT_ENV_VARS
ImplementationType
INTEGER_ENV_VARS
IterateKeysInfo
IterateKeysTuple
JSONSerializable
LIST_ITEMS_LIMIT
ListOrDictOrAny
LOCAL_ENTRY_NAME_DIGITS
logger
logger_name
MainReturnType
MAX_CACHED_REQUESTS
MAX_PAYLOAD_SIZE_BYTES
9MB
MAX_QUERIES_FOR_CONSISTENCY
MetadataType
PARSE_DATE_FIELDS_KEY_SUFFIX
PARSE_DATE_FIELDS_MAX_DEPTH
QUERY_HEAD_BUFFER
QUERY_HEAD_MIN_LENGTH
RECENTLY_HANDLED_CACHE_SIZE
REQUEST_ID_LENGTH
REQUEST_QUEUE_HEAD_MAX_LIMIT
ResourceClientType
SAFETY_BUFFER_PERCENT
0.01%
Remove an item from the cache.