Picked up JAVA_TOOL_OPTIONS: -Djdk.jar.maxSignatureFileSize=2147483639
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/5.2.20240509.1/spark3/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/5.2.20240509.1/spark3/jars/spark-streaming-kafka-0-10-assembly_2.12-3.3.1.5.2.20240509.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2024-06-12 07:04:22,146 INFO SignalUtils [main]: Registering signal handler for TERM
2024-06-12 07:04:22,292 INFO SignalUtils [main]: Registering signal handler for HUP
2024-06-12 07:04:22,292 INFO SignalUtils [main]: Registering signal handler for INT
2024-06-12 07:04:22,466 INFO SecurityManager [main]: Changing view acls to: trusted-service-user
2024-06-12 07:04:22,466 INFO SecurityManager [main]: Changing modify acls to: trusted-service-user
2024-06-12 07:04:22,467 INFO SecurityManager [main]: Changing view acls groups to:
2024-06-12 07:04:22,468 INFO SecurityManager [main]: Changing modify acls groups to:
2024-06-12 07:04:22,468 INFO SecurityManager [main]: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(trusted-service-user); groups with view permissions: Set(); users with modify permissions: Set(trusted-service-user); groups with modify permissions: Set()
2024-06-12 07:04:22,605 INFO MetricsConfig [AsyncAppender-Dispatcher-Thread-4]: Loaded properties from hadoop-metrics2.properties
2024-06-12 07:04:22,674 INFO MetricsSystemImpl [AsyncAppender-Dispatcher-Thread-4]: Scheduled Metric snapshot period at 10 second(s).
2024-06-12 07:04:22,674 INFO MetricsSystemImpl [AsyncAppender-Dispatcher-Thread-4]: azure-file-system metrics system started
2024-06-12 07:04:22,604 INFO ApplicationMaster [main]: ApplicationAttemptId: appattempt_1718175835080_0001_000001
2024-06-12 07:04:22,838 INFO ApplicationMaster [main]: Starting the user application in a separate Thread
2024-06-12 07:04:22,842 INFO ApplicationMaster [main]: Waiting for spark context initialization...
2024-06-12 07:04:22,925 INFO PythonRunner$ [Driver]: Initialized PythonRunnerOutputStream plugin org.apache.spark.microsoft.tools.api.plugin.MSToolsPythonRunnerOutputStreamPlugin.
2024-06-12 07:04:39,644 INFO SparkContext [Thread-47]: Running Spark version 3.3.1.5.2.20240509.1
2024-06-12 07:04:39,681 INFO ResourceUtils [Thread-47]: ==============================================================
2024-06-12 07:04:39,682 INFO ResourceUtils [Thread-47]: No custom resources configured for spark.driver.
2024-06-12 07:04:39,682 INFO ResourceUtils [Thread-47]: ==============================================================
2024-06-12 07:04:39,683 INFO SparkContext [Thread-47]: Submitted application: Azure ML Experiment
2024-06-12 07:04:39,722 INFO ResourceProfile [Thread-47]: Default ResourceProfile created, executor resources: Map(memoryOverhead -> name: memoryOverhead, amount: 384, script: , vendor: , cores -> name: cores, amount: 4, script: , vendor: , memory -> name: memory, amount: 4096, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
2024-06-12 07:04:39,728 INFO ResourceProfile [Thread-47]: Limiting resource is cpus at 4 tasks per executor
2024-06-12 07:04:39,730 INFO ResourceProfileManager [Thread-47]: Added ResourceProfile id: 0
2024-06-12 07:04:39,848 INFO SecurityManager [Thread-47]: Changing view acls to: trusted-service-user
2024-06-12 07:04:39,849 INFO SecurityManager [Thread-47]: Changing modify acls to: trusted-service-user
2024-06-12 07:04:39,849 INFO SecurityManager [Thread-47]: Changing view acls groups to:
2024-06-12 07:04:39,849 INFO SecurityManager [Thread-47]: Changing modify acls groups to:
2024-06-12 07:04:39,849 INFO SecurityManager [Thread-47]: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(trusted-service-user); groups with view permissions: Set(); users with modify permissions: Set(trusted-service-user); groups with modify permissions: Set()
2024-06-12 07:04:40,180 INFO Utils [Thread-47]: Successfully started service 'sparkDriver' on port 42075.
2024-06-12 07:04:40,280 INFO SparkEnv [Thread-47]: Registering MapOutputTracker
2024-06-12 07:04:40,357 INFO SparkEnv [Thread-47]: Registering BlockManagerMaster
2024-06-12 07:04:40,391 INFO BlockManagerMasterEndpoint [Thread-47]: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2024-06-12 07:04:40,392 INFO BlockManagerMasterEndpoint [Thread-47]: BlockManagerMasterEndpoint up
2024-06-12 07:04:40,433 INFO SparkEnv [Thread-47]: Registering BlockManagerMasterHeartbeat
2024-06-12 07:04:40,456 INFO DiskBlockManager [Thread-47]: Created local directory at /mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/blockmgr-f0632606-3744-432d-9824-9212c8736530
2024-06-12 07:04:40,478 INFO MemoryStore [Thread-47]: MemoryStore started with capacity 3.0 GiB
2024-06-12 07:04:40,527 INFO SparkEnv [Thread-47]: Registering OutputCommitCoordinator
2024-06-12 07:04:40,958 INFO Utils [Thread-47]: Successfully started service 'SparkUI' on port 33295.
2024-06-12 07:04:40,960 INFO ServerInfo [Thread-47]: Adding filter to /: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:41,226 INFO YarnClusterScheduler [Thread-47]: Created YarnClusterScheduler
2024-06-12 07:04:41,338 INFO Utils [Thread-47]: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42761.
2024-06-12 07:04:41,339 INFO NettyBlockTransferService [Thread-47]: Server created on vm-58f13156:42761
2024-06-12 07:04:41,341 INFO BlockManager [Thread-47]: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2024-06-12 07:04:41,342 INFO BlockManager [Thread-47]: external shuffle service port = 7337
2024-06-12 07:04:41,360 INFO BlockManagerMaster [Thread-47]: Registering BlockManager BlockManagerId(driver, vm-58f13156, 42761, None)
2024-06-12 07:04:41,364 INFO BlockManagerMasterEndpoint [dispatcher-BlockManagerMaster]: Registering block manager vm-58f13156:42761 with 3.0 GiB RAM, BlockManagerId(driver, vm-58f13156, 42761, None)
2024-06-12 07:04:41,367 INFO BlockManagerMaster [Thread-47]: Registered BlockManager BlockManagerId(driver, vm-58f13156, 42761, None)
2024-06-12 07:04:41,368 INFO BlockManager [Thread-47]: Initialized BlockManager: BlockManagerId(driver, vm-58f13156, 42761, None)
2024-06-12 07:04:41,406 INFO SparkObservabilityBus [SparkObservabilityManager-0]: After emitter initializes.
2024-06-12 07:04:41,414 INFO Logger [SparkObservabilityManager-0]: SparkDiagnosticEmitter: configured Shoebox
2024-06-12 07:04:41,417 INFO SparkObservabilityBus [SparkObservabilityManager-0]: SparkDiagnosticEmitter: ShoeboxEmitter initialized
2024-06-12 07:04:41,670 INFO SingleEventLogFileWriter [Thread-47]: Logging events to wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/events/291/eventLogs/application_1718175835080_0001_1.inprogress
2024-06-12 07:04:41,829 INFO EnhancementLiveStatusPlugin [Thread-47]: Enhancement Live App Status Plugin initialization
2024-06-12 07:04:41,837 INFO EnhancementLiveStatusPlugin [Thread-47]: Live app enhancement was enabled
2024-06-12 07:04:41,846 INFO EnhancementAppStatusListener [Thread-47]: attach Enhancement AppStatus Listener on live application
2024-06-12 07:04:41,847 INFO DataOperations [Thread-47]: configuration: spark.data.maxRecords: 1000
2024-06-12 07:04:41,854 INFO EnhancementExecutorStoreWriter [Thread-47]: Use sync store writer for executor usage info update.
2024-06-12 07:04:41,857 INFO SparkContext [Thread-47]: Registered live app status plugin org.apache.spark.ui.EnhancementLiveStatusPlugin.
2024-06-12 07:04:41,889 INFO SparkContext [Thread-47]: Registered live app status plugin org.apache.spark.diagnostic.synapse.SparkDiagnosticPlugin.
2024-06-12 07:04:41,938 INFO RpcAppSparkContextServer [Thread-47]: Opening remote SparkContext service at 10.0.32.6:18083, remoteSparkContext/remoteSparkContextEndpoint
2024-06-12 07:04:41,948 INFO RemoteSparkContextServer [Thread-47]: Opening remote SparkContext server
2024-06-12 07:04:41,949 INFO SecurityManager [Thread-47]: Changing view acls to: trusted-service-user
2024-06-12 07:04:41,949 INFO SecurityManager [Thread-47]: Changing modify acls to: trusted-service-user
2024-06-12 07:04:41,949 INFO SecurityManager [Thread-47]: Changing view acls groups to:
2024-06-12 07:04:41,950 INFO SecurityManager [Thread-47]: Changing modify acls groups to:
2024-06-12 07:04:41,950 INFO SecurityManager [Thread-47]: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(trusted-service-user); groups with view permissions: Set(); users with modify permissions: Set(trusted-service-user); groups with modify permissions: Set()
2024-06-12 07:04:41,971 INFO Utils [Thread-47]: Successfully started service 'remoteSparkContext' on port 18083.
2024-06-12 07:04:41,972 INFO RemoteSparkContextServer [Thread-47]: Will serve remote SparkContext on 10.0.32.6:18083, remoteSparkContext/remoteSparkContextEndpoint
2024-06-12 07:04:42,176 INFO RpcAppListener [Thread-47]: Got host of RPC history server from node info: vm-58f13156
2024-06-12 07:04:42,178 INFO RpcAppSender [Thread-47]: Opening RPC app sender
2024-06-12 07:04:42,184 INFO SecurityManager [scala-execution-context-global-146]: Changing view acls to: trusted-service-user
2024-06-12 07:04:42,184 INFO SecurityManager [scala-execution-context-global-146]: Changing modify acls to: trusted-service-user
2024-06-12 07:04:42,184 INFO SecurityManager [scala-execution-context-global-146]: Changing view acls groups to:
2024-06-12 07:04:42,184 INFO SecurityManager [scala-execution-context-global-146]: Changing modify acls groups to:
2024-06-12 07:04:42,185 INFO SecurityManager [scala-execution-context-global-146]: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(trusted-service-user); groups with view permissions: Set(); users with modify permissions: Set(trusted-service-user); groups with modify permissions: Set()
2024-06-12 07:04:42,250 INFO TransportClientFactory [netty-rpc-connection-0]: Successfully created connection to vm-58f13156/10.0.32.6:18082 after 33 ms (0 ms spent in bootstraps)
2024-06-12 07:04:42,351 INFO RpcAppSender [Thread-47]: Will send events to RPC history server at vm-58f13156:18082, rpcHistoryServer/rpcHistoryServerEndpoint
2024-06-12 07:04:42,354 INFO SparkContext [Thread-47]: Set timeout to event queue sparkRpcHistoryServer. Timeout: 10000
2024-06-12 07:04:42,355 INFO SparkContext [Thread-47]: Registered live app status plugin org.apache.spark.deploy.history.rpc.app.RpcAppLivePlugin.
Listeners added to queue: sparkRpcHistoryServer
2024-06-12 07:04:42,356 INFO SparklyrConnector [Thread-47]: Load sparklyr connector: org.apache.spark.sparklyr.DefaultConnector
2024-06-12 07:04:43,763 INFO SingleEventLogFileWriter [Thread-47]: Logging events to wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/events/291/eventLogs/tmp/metrics_service/metric_file.inprogress
2024-06-12 07:04:43,766 INFO SparkContext [Thread-47]: Registered listener com.microsoft.hdinsight.spark.metrics.SparkMetricsListener
2024-06-12 07:04:43,766 INFO SparkContext [Thread-47]: Registered listener org.apache.spark.listeners.LogAnalyticsSparkListener
2024-06-12 07:04:43,766 INFO SparkContext [Thread-47]: Registered listener com.microsoft.impulse.analyze.eventLog.ImpulseListener
2024-06-12 07:04:43,767 INFO SparkContext [Thread-47]: Registered listener org.apache.spark.advise.input.MetricsServiceListener
2024-06-12 07:04:43,848 INFO RpcAppSender [spark-listener-group-sparkRpcHistoryServer]: Remote SparkContext server is opened on 10.0.32.6:18083, remoteSparkContext/remoteSparkContextEndpoint
2024-06-12 07:04:43,881 INFO ServerInfo [Thread-47]: Adding filter to /jobs: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,882 INFO ServerInfo [Thread-47]: Adding filter to /jobs/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,883 INFO ServerInfo [Thread-47]: Adding filter to /jobs/job: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,885 INFO ServerInfo [Thread-47]: Adding filter to /jobs/job/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,886 INFO ServerInfo [Thread-47]: Adding filter to /stages: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,886 INFO ServerInfo [Thread-47]: Adding filter to /stages/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,887 INFO ServerInfo [Thread-47]: Adding filter to /stages/stage: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,889 INFO ServerInfo [Thread-47]: Adding filter to /stages/stage/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,891 INFO ServerInfo [Thread-47]: Adding filter to /stages/pool: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,892 INFO ServerInfo [Thread-47]: Adding filter to /stages/pool/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,893 INFO ServerInfo [Thread-47]: Adding filter to /storage: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,894 INFO ServerInfo [Thread-47]: Adding filter to /storage/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,895 INFO ServerInfo [Thread-47]: Adding filter to /storage/rdd: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,896 INFO ServerInfo [Thread-47]: Adding filter to /storage/rdd/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,897 INFO ServerInfo [Thread-47]: Adding filter to /environment: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,898 INFO ServerInfo [Thread-47]: Adding filter to /environment/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,899 INFO ServerInfo [Thread-47]: Adding filter to /executors: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,900 INFO ServerInfo [Thread-47]: Adding filter to /executors/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,901 INFO ServerInfo [Thread-47]: Adding filter to /executors/threadDump: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,903 INFO ServerInfo [Thread-47]: Adding filter to /executors/threadDump/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,904 INFO ServerInfo [Thread-47]: Adding filter to /static: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,915 INFO ServerInfo [Thread-47]: Adding filter to /: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,917 INFO ServerInfo [Thread-47]: Adding filter to /api: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,920 INFO ServerInfo [Thread-47]: Adding filter to /metrics: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,921 INFO ServerInfo [Thread-47]: Adding filter to /jobs/job/kill: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,922 INFO ServerInfo [Thread-47]: Adding filter to /stages/stage/kill: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,927 INFO ServerInfo [Thread-47]: Adding filter to /metrics/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:43,928 INFO ServerInfo [Thread-47]: Adding filter to /metrics/prometheus: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:44,018 INFO RequestHedgingRMFailoverProxyProvider [main]: Created wrapped proxy for [rm1, rm2]
2024-06-12 07:04:44,020 INFO YarnRMClient [main]: Registering the ApplicationMaster
2024-06-12 07:04:44,025 INFO RequestHedgingRMFailoverProxyProvider [main]: Looking for the active RM in [rm1, rm2]...
2024-06-12 07:04:44,208 INFO RequestHedgingRMFailoverProxyProvider [main]: Found active RM [rm1] 2024-06-12 07:04:44,214 INFO ApplicationMaster [main]: Preparing Local resources 2024-06-12 07:04:44,330 INFO ApplicationMaster [main]: =============================================================================== Default YARN executor launch context: env: TID -> caa95068-7cc1-4c41-925f-875c22a5c4c9 AZUREML_OBO_CANARY_TOKEN -> eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZS9leHBlcmltZW50TmFtZS91bWljby1kcy1sb3lhbHR5LWZlZWQvcnVuSWQvcmVkX3NlZWRfbjl2NGtqc20xNiIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImV4cCI6MTcxOTkwMzcxNCwiYXVkIjoiYXp1cmVtbCJ9.CYO9tBokxw5fqbaOoLDK5A5kZQfG8iqhET8GGNxxJPAHizmKPm8skfsUYWa3ucdTKqK6L6np0rUzwrht49ctua2G0ynBXGKuytVpBXpYuAZY_XcJQkkUiNp25sOtfYVyCrOYr-BNDwjwrcXHIgcV-L5mCBqw-1sVQZy0YBxoCRFEjJEpf_3Jp58H8JZ00X8kcWSpxz1La3tFbYNEF4j1TVokaQlw_hjyJLb0mFx-2kPaPgmITC1ZEvzXuuxlJ93BjwPZyie-3dWKoADBpgWFFJ0T5DaEcyF_gnbbFu9MgXKG5Io9jY1LIVo7x-ElIOgX4a_TNt6UHmKG90zxV6WJEA CLASSPATH -> /usr/lib/library-manager/bin/libraries/scala/*:/usr/lib/dw-connector/synapse/*{{PWD}}{{PWD}}/__spark_conf__{{PWD}}/__spark_libs__/*/opt/spark/jars/*{{PWD}}/__spark_conf__/__hadoop_conf__ AZUREML_OBO_USER_TOKEN_FOR_SPARK_RETRIEVAL_API -> getuseraccesstokenforspark AZUREML_ARM_SUBSCRIPTION -> 
14e4c1c9-5437-4eb8-8dad-45696707c729 AZUREML_ARM_RESOURCEGROUP -> ds-resources AZUREML_OBO_SERVICE_ENDPOINT -> https://westeurope.api.azureml.ms AZUREML_RUN_TOKEN_EXPIRY -> 1719997314 SPARK_YARN_STAGING_DIR -> wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/user/trusted-service-user/trusted-service-user/.sparkStaging/application_1718175835080_0001 PATH -> /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/usr/local/cuda-11.5/bin:/home/trusted-service-user/cluster-env/env/bin:/home/trusted-service-user/cluster-env/synapse_trident_r/bin SPARK_USER -> trusted-service-user AZUREML_ARM_PROJECT_NAME -> umico-ds-loyalty-feed SPARKR_INLINE_SESSION_LEVEL_ENABLE -> true JAVA_TOOL_OPTIONS -> -Djdk.jar.maxSignatureFileSize=2147483639 AZUREML_WORKSPACE_ID -> 25d827f3-cf10-4e2a-b65d-40c316812ddd AZUREML_RUN_ID -> red_seed_n9v4kjsm16 SPARK_HOME -> /opt/spark PYTHONPATH -> /opt/spark/python/lib/pyspark.zip/opt/spark/python/lib/py4j-0.10.7-src.zip{{PWD}}/source.zip{{PWD}}/setup.zip AZUREML_RUN_TOKEN -> 
eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZSIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMjVkODI3ZjMtY2YxMC00ZTJhLWI2NWQtNDBjMzE2ODEyZGRkIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImFwcGlkIjoiQmFraHJ1eiBEemhhZmFyb3YiLCJleHAiOjE3MTk5OTczMTQsImF1ZCI6ImF6dXJlbWwifQ.qMiL7T_ZZwGiGLKd9yOQLAI9KvI-W4_xIDoPsl42rL6Gu1uJGlbD5XxOAcfoVzFnKAH7tooTKsVvWUQ4xV9EoHoEwlIQo3psQKnL2QRZ4mxB8i6vEOU8vEu0oVwMvPwoEP3fcw1cPCSrPOSYkexvIqFoQj5HUtbQibEmcFQUhRgDW1G0pT5UWO_uyuro_pR5enTxwNS3F6MPQKHTXgBGl_4nyM1K21yLEvp2P7jUfz7C3sG_rdcAeWA4IZWAYWZUfOO2Zi2d9Id7WnX8yOekNldT1aUFBUwajrkFZmR5qpzqGl0VkmMcGNz2cJCxydbUiIXqG3zU3MHh-oPnj5giKw OID -> 1891a550-4704-4a51-953d-a5e334b794f5 AZUREML_DATAPREP_TOKEN_PROVIDER -> sparkobo AZUREML_SERVICE_CERT_ENDPOINT -> https://westeurope.cert.api.azureml.ms AZUREML_SERVICE_ENDPOINT -> https://westeurope.api.azureml.ms AZUREML_ARM_WORKSPACE_NAME -> ds-workspace AZUREML_EXPERIMENT_ID -> 63a713bb-3fa2-48af-a358-bc0522d4b3e1 command: LD_LIBRARY_PATH=\"/usr/hdp/current/hadoop-client/lib/native:$LD_LIBRARY_PATH\" \ {{JAVA_HOME}}/bin/java \ -server \ -Xmx4096m \ '-XX:+IgnoreUnrecognizedVMOptions' \ '--add-opens=java.base/java.lang=ALL-UNNAMED' \ '--add-opens=java.base/java.lang.invoke=ALL-UNNAMED' \ '--add-opens=java.base/java.lang.reflect=ALL-UNNAMED' \ '--add-opens=java.base/java.io=ALL-UNNAMED' \ '--add-opens=java.base/java.net=ALL-UNNAMED' \ '--add-opens=java.base/java.nio=ALL-UNNAMED' \ 
'--add-opens=java.base/java.util=ALL-UNNAMED' \ '--add-opens=java.base/java.util.concurrent=ALL-UNNAMED' \ '--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED' \ '--add-opens=java.base/sun.nio.ch=ALL-UNNAMED' \ '--add-opens=java.base/sun.nio.cs=ALL-UNNAMED' \ '--add-opens=java.base/sun.security.action=ALL-UNNAMED' \ '--add-opens=java.base/sun.util.calendar=ALL-UNNAMED' \ '--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED' \ '-Dlog4j2.configurationFile=file:/usr/hdp/current/spark3-client/conf/executor-log4j2.properties' \ '-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl' \ '-XX:+UseG1GC' \ -Djava.io.tmpdir={{PWD}}/tmp \ '-Dspark.driver.port=42075' \ '-Dspark.synapse.history.rpc.port=18082' \ '-Dspark.ui.port=0' \ '-Dspark.history.ui.port=18080' \ -Dspark.yarn.app.container.log.dir= \ -XX:OnOutOfMemoryError='kill %p' \ org.apache.spark.executor.YarnCoarseGrainedExecutorBackend \ --driver-url \ spark://CoarseGrainedScheduler@vm-58f13156:42075 \ --executor-id \ \ --hostname \ \ --cores \ 4 \ --app-id \ application_1718175835080_0001 \ --resourceProfileId \ 0 \ 1>/stdout \ 2>/stderr resources: setup.zip -> resource { scheme: "wasbs" host: "hobostoragenue9ivxr1n.blob.core.windows.net" port: -1 file: "/user/trusted-service-user/trusted-service-user/.sparkStaging/application_1718175835080_0001/setup.zip" userInfo: "9e2f8fd9-9d5f-4acd-99b5-3885490a4d31" } size: 92383 timestamp: 1718175856000 type: FILE visibility: PRIVATE __spark_conf__ -> resource { scheme: "wasbs" host: "hobostoragenue9ivxr1n.blob.core.windows.net" port: -1 file: "/user/trusted-service-user/trusted-service-user/.sparkStaging/application_1718175835080_0001/__spark_conf__.zip" userInfo: "9e2f8fd9-9d5f-4acd-99b5-3885490a4d31" } size: 365244 timestamp: 1718175857000 type: ARCHIVE visibility: PRIVATE source.zip -> resource { scheme: "wasbs" host: "hobostoragenue9ivxr1n.blob.core.windows.net" port: -1 file: 
"/user/trusted-service-user/trusted-service-user/.sparkStaging/application_1718175835080_0001/source.zip" userInfo: "9e2f8fd9-9d5f-4acd-99b5-3885490a4d31" } size: 2329824 timestamp: 1718175856000 type: FILE visibility: PRIVATE =============================================================================== 2024-06-12 07:04:44,366 INFO YarnAllocator [main]: Yarn Executor Decommissioning Enabled 2024-06-12 07:04:44,380 INFO YarnAllocator [main]: Resource profile 0 doesn't exist, adding it 2024-06-12 07:04:44,404 INFO Configuration [main]: resource-types.xml not found 2024-06-12 07:04:44,404 INFO ResourceUtils [main]: Unable to find 'resource-types.xml'. 2024-06-12 07:04:44,413 INFO YarnSchedulerBackend$YarnSchedulerEndpoint [dispatcher-event-loop-0]: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@vm-58f13156:42075) 2024-06-12 07:04:44,421 INFO YarnAllocator [main]: Will request 2 executor container(s) for ResourceProfile Id: 0, each with 4 core(s) and 4480 MB memory. with custom resources: 2024-06-12 07:04:44,437 INFO YarnAllocator [main]: Submitted 2 unlocalized container requests. 
2024-06-12 07:04:44,450 INFO RpcAppSender [spark-listener-group-sparkRpcHistoryServer]: Sent driver info of application_1718175835080_0001/1 to RPC history server
2024-06-12 07:04:44,453 INFO DefaultsConfigSparkListener [spark-listener-group-shared]: Persisted __spark_conf_merge_records__.json
2024-06-12 07:04:44,496 INFO ApplicationMaster [main]: Started progress reporter thread with (heartbeat : 1000, initial allocation : 200) intervals
2024-06-12 07:04:44,497 INFO YarnClusterSchedulerBackend [Thread-47]: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
2024-06-12 07:04:44,498 INFO YarnClusterScheduler [Thread-47]: YarnClusterScheduler.postStartHook done
2024-06-12 07:04:44,534 INFO SparkContext [Thread-47]: Initialized SparkContextAfterInit plugin org.apache.spark.microsoft.tools.api.plugin.MSToolsSparkContextAfterInitPlugin.
2024-06-12 07:04:44,552 INFO YarnAllocator [Reporter]: Launching container container_1718175835080_0001_01_000002 on host vm-58f13156 for executor with ID 1 for ResourceProfile Id 0
2024-06-12 07:04:44,555 INFO YarnAllocator [Reporter]: Received 1 containers from YARN, launching executors on 1 of them.
2024-06-12 07:04:44,582 INFO ExecutorRunnable [ContainerLauncher-0]: Initializing service data for shuffle service using name 'spark_shuffle'
2024-06-12 07:04:44,625 INFO LighterServerPlugin [Thread-47]: Loaded Lighter server plugin: org.apache.spark.lighter.DefaultLighterServerPlugin
2024-06-12 07:04:45,166 INFO YarnAllocator [Reporter]: Launching container container_1718175835080_0001_01_000003 on host vm-14223739 for executor with ID 2 for ResourceProfile Id 0
2024-06-12 07:04:45,167 INFO YarnAllocator [Reporter]: Received 1 containers from YARN, launching executors on 1 of them.
2024-06-12 07:04:45,170 INFO ExecutorRunnable [ContainerLauncher-1]: Initializing service data for shuffle service using name 'spark_shuffle'
2024-06-12 07:04:48,301 INFO YarnSchedulerBackend$YarnDriverEndpoint [dispatcher-CoarseGrainedScheduler]: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.0.32.6:33208) with ID 1, ResourceProfileId 0
2024-06-12 07:04:48,326 INFO SharedState [spark-listener-group-shared]: Setting hive.metastore.warehouse.dir ('wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/synapse/workspaces/25d827f3-cf10-4e2a-b65d-40c316812ddd/warehouse') to the value of spark.sql.warehouse.dir.
2024-06-12 07:04:48,331 INFO SharedState [spark-listener-group-shared]: Warehouse path is 'wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/synapse/workspaces/25d827f3-cf10-4e2a-b65d-40c316812ddd/warehouse'.
2024-06-12 07:04:48,406 INFO ServerInfo [spark-listener-group-shared]: Adding filter to /SQL: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:48,407 INFO ServerInfo [spark-listener-group-shared]: Adding filter to /SQL/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:48,409 INFO ServerInfo [spark-listener-group-shared]: Adding filter to /SQL/execution: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:48,410 INFO ServerInfo [spark-listener-group-shared]: Adding filter to /SQL/execution/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:48,412 INFO ServerInfo [spark-listener-group-shared]: Adding filter to /static/sql: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2024-06-12 07:04:48,425 WARN SQLConf [spark-listener-group-shared]: The SQL config 'spark.sql.legacy.replaceDatabricksSparkAvro.enabled' has been deprecated in Spark v3.2 and may be removed in the future. Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead.
2024-06-12 07:04:48,431 WARN SQLConf [spark-listener-group-shared]: The SQL config 'spark.sql.legacy.replaceDatabricksSparkAvro.enabled' has been deprecated in Spark v3.2 and may be removed in the future. Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead.
2024-06-12 07:04:48,527 INFO BlockManagerMasterEndpoint [dispatcher-BlockManagerMaster]: Registering block manager vm-58f13156:40101 with 2.2 GiB RAM, BlockManagerId(1, vm-58f13156, 40101, None)
2024-06-12 07:04:49,793 INFO AsyncEventQueue [spark-listener-group-shared]: Process of event SparkListenerExecutorAdded(1718175888304,1,org.apache.spark.scheduler.cluster.ExecutorData@5163fdf4) by listener SparkMetricsListener took 1.479425315s.
2024-06-12 07:04:51,629 WARN SparkSession [Thread-47]: Using an existing Spark session; only runtime SQL configurations will take effect.
2024-06-12 07:04:51,646 WARN SparkSession [Thread-47]: Using an existing Spark session; only runtime SQL configurations will take effect.
2024-06-12 07:04:54,903 INFO notebookUtils [Thread-47]: [mount operation] start mount wasbs://azureml@dsmlstoragexfsyt.blob.core.windows.net/ExperimentRun/dcid.red_seed_n9v4kjsm16 to /AmlJobLogs/dcid.red_seed_n9v4kjsm16
2024-06-12 07:04:54,905 INFO notebookUtils [Thread-47]: removeJobIdFromMountedPath is true, creating mount point without jobId as prefix
2024-06-12 07:04:54,944 INFO notebookUtils [Thread-47]: Init AzureStoreUtils with url https://hobostoragenue9ivxr1n.blob.core.windows.net/9e2f8fd9-9d5f-4acd-99b5-3885490a4d31
2024-06-12 07:04:54,945 INFO notebookUtils [Thread-47]: checkBlobIfExists - /mounts/mount.json start execute request.
2024-06-12 07:04:55,200 INFO notebookUtils [Thread-47]: checkBlobIfExists - /mounts/mount.json execute request successfully.
2024-06-12 07:04:55,202 INFO notebookUtils [Thread-47]: createNewBlob - /mounts/mount.json start execute request.
2024-06-12 07:04:55,215 INFO notebookUtils [Thread-47]: createNewBlob - /mounts/mount.json execute request successfully.
2024-06-12 07:04:55,216 INFO notebookUtils [Thread-47]: getBlobContent - /mounts/mount.json start execute request.
2024-06-12 07:04:55,223 INFO notebookUtils [Thread-47]: getBlobContent - /mounts/mount.json execute request successfully.
2024-06-12 07:04:55,230 INFO notebookUtils [Thread-47]: found sasToken.
2024-06-12 07:04:55,231 INFO notebookUtils [Thread-47]: get folder path from source URL - ExperimentRun/dcid.red_seed_n9v4kjsm16
2024-06-12 07:04:55,232 INFO notebookUtils [Thread-47]: [pre-verify before mount] sent list call to https://dsmlstoragexfsyt.blob.core.windows.net/azureml?restype=container&comp=list&prefix=ExperimentRun/dcid.red_seed_n9v4kjsm16/&maxresults=1
2024-06-12 07:04:55,233 INFO notebookUtils [Thread-47]: [pre-verify before mount] storage account name is dsmlstoragexfsyt
2024-06-12 07:04:55,234 INFO notebookUtils [Thread-47]: [pre-verify before mount] checking with sasToken
2024-06-12 07:04:55,273 INFO notebookUtils [Thread-47]: [pre-verify before mount] user has permission to access wasbs://azureml@dsmlstoragexfsyt.blob.core.windows.net/ExperimentRun/dcid.red_seed_n9v4kjsm16 cost 42 ms
2024-06-12 07:04:55,275 INFO notebookUtils [Thread-47]: Try to create __lock__ file before do mount or unmount
2024-06-12 07:04:55,275 INFO notebookUtils [Thread-47]: createNewBlob - /mounts/__lock__ start execute request.
2024-06-12 07:04:55,284 INFO notebookUtils [Thread-47]: createNewBlob - /mounts/__lock__ execute request successfully.
2024-06-12 07:04:55,285 INFO notebookUtils [Thread-47]: successfully create __lock__ file
2024-06-12 07:04:55,285 INFO notebookUtils [Thread-47]: checkBlobIfExists - /mounts/mount.json start execute request.
2024-06-12 07:04:55,291 INFO notebookUtils [Thread-47]: checkBlobIfExists - /mounts/mount.json execute request successfully.
2024-06-12 07:04:55,292 INFO notebookUtils [Thread-47]: getBlobContent - /mounts/mount.json start execute request. 2024-06-12 07:04:55,298 INFO notebookUtils [Thread-47]: getBlobContent - /mounts/mount.json execute request successfully. 2024-06-12 07:04:55,306 INFO notebookUtils [Thread-47]: [BaseOperation] start creating operation - f834bd6a-3395-476c-bfe6-d3c2eedbfbae 2024-06-12 07:04:55,306 INFO notebookUtils [Thread-47]: folder is ExperimentRun/dcid.red_seed_n9v4kjsm16 2024-06-12 07:04:55,349 INFO notebookUtils [Thread-47]: Staring to create operation with id - f834bd6a-3395-476c-bfe6-d3c2eedbfbae 2024-06-12 07:04:55,408 INFO notebookUtils [Thread-47]: [createOp - f834bd6a-3395-476c-bfe6-d3c2eedbfbae] Request success. 2024-06-12 07:04:55,409 INFO notebookUtils [Thread-47]: Creating operation with id f834bd6a-3395-476c-bfe6-d3c2eedbfbae finished. 2024-06-12 07:04:55,410 INFO notebookUtils [Thread-47]: [BaseOperation] start trigger operation action - f834bd6a-3395-476c-bfe6-d3c2eedbfbae 2024-06-12 07:04:55,411 INFO notebookUtils [Thread-47]: Staring to run operation action with operation id - f834bd6a-3395-476c-bfe6-d3c2eedbfbae 2024-06-12 07:04:55,422 INFO notebookUtils [Thread-47]: [runOperationAction - f834bd6a-3395-476c-bfe6-d3c2eedbfbae] Request success. 2024-06-12 07:04:55,422 INFO notebookUtils [Thread-47]: Running operation action with id f834bd6a-3395-476c-bfe6-d3c2eedbfbae finished. 2024-06-12 07:04:55,423 INFO notebookUtils [Thread-47]: Staring to get operation action status with operation id - f834bd6a-3395-476c-bfe6-d3c2eedbfbae 2024-06-12 07:04:55,434 INFO notebookUtils [Thread-47]: [getOperationActionStatus - f834bd6a-3395-476c-bfe6-d3c2eedbfbae] Request success. 2024-06-12 07:04:55,452 INFO notebookUtils [Thread-47]: Getting operation action status with id f834bd6a-3395-476c-bfe6-d3c2eedbfbae finished. 
2024-06-12 07:04:55,453 INFO notebookUtils [Thread-47]: [BaseOperation] operation timeout be set to 600s 2024-06-12 07:04:55,455 INFO notebookUtils [Thread-47]: [BaseOperation] operation action still in InProgress status, waiting another 10 seconds. 2024-06-12 07:05:05,456 INFO notebookUtils [Thread-47]: Staring to get operation action status with operation id - f834bd6a-3395-476c-bfe6-d3c2eedbfbae 2024-06-12 07:05:05,460 INFO notebookUtils [Thread-47]: [getOperationActionStatus - f834bd6a-3395-476c-bfe6-d3c2eedbfbae] Request success. 2024-06-12 07:05:05,461 INFO notebookUtils [Thread-47]: Getting operation action status with id f834bd6a-3395-476c-bfe6-d3c2eedbfbae finished. 2024-06-12 07:05:05,462 INFO notebookUtils [Thread-47]: [BaseOperation] operation action f834bd6a-3395-476c-bfe6-d3c2eedbfbae finished after 10 seconds with status Succeeded 2024-06-12 07:05:05,463 INFO notebookUtils [Thread-47]: mount wasbs://azureml@dsmlstoragexfsyt.blob.core.windows.net/ExperimentRun/dcid.red_seed_n9v4kjsm16 to /AmlJobLogs/dcid.red_seed_n9v4kjsm16 successfully 2024-06-12 07:05:05,469 INFO notebookUtils [Thread-47]: createNewBlob - /mounts/mount.json start execute request. 2024-06-12 07:05:05,479 INFO notebookUtils [Thread-47]: createNewBlob - /mounts/mount.json execute request successfully. 2024-06-12 07:05:05,479 INFO notebookUtils [Thread-47]: Try to release lock for mount info config file 2024-06-12 07:05:05,480 INFO notebookUtils [Thread-47]: checkBlobIfExists - /mounts/__lock__ start execute request. 2024-06-12 07:05:05,485 INFO notebookUtils [Thread-47]: checkBlobIfExists - /mounts/__lock__ execute request successfully. 2024-06-12 07:05:05,486 INFO notebookUtils [Thread-47]: deleteBlob - /mounts/__lock__ start execute request. 2024-06-12 07:05:05,494 INFO notebookUtils [Thread-47]: deleteBlob - /mounts/__lock__ execute request successfully. 2024-06-12 07:05:05,494 INFO notebookUtils [Thread-47]: successfully release lock for mountinfo config file. 
2024-06-12 07:05:05,495 INFO notebookUtils [Thread-47]: [telemetry][info][funcName:MountStorageByMssparkutils|cost:10619|traceId:186a9e99-4fe7-4e5e-bbae-3cc1191e6667|productType:aml|storageType:Blob Storage] exec success 2024-06-12 07:05:05,504 INFO notebookUtils [Thread-47]: [telemetry][info][funcName:GetMountPathByMssparkutils|cost:8|traceId:7fdb0871-249b-458f-af1e-fd1de5a46ee1|productType:aml] exec success 2024-06-12 07:05:09,220 INFO YarnSchedulerBackend$YarnDriverEndpoint [dispatcher-CoarseGrainedScheduler]: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.0.32.4:57308) with ID 2, ResourceProfileId 0 2024-06-12 07:05:09,560 INFO BlockManagerMasterEndpoint [dispatcher-BlockManagerMaster]: Registering block manager vm-14223739:44757 with 2.2 GiB RAM, BlockManagerId(2, vm-14223739, 44757, None) 2024-06-12 07:05:21,038 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:21,068 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:21,219 INFO InMemoryFileIndex [Thread-47]: It took 50 ms to list leaf files for 1 paths. 
2024-06-12 07:05:21,554 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:21,573 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 0 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:21,573 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 0 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:21,574 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:21,575 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:21,580 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 0 (MapPartitionsRDD[1] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:21,784 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_0 stored as values in memory (estimated size 206.4 KiB, free 3.0 GiB) 2024-06-12 07:05:22,280 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_0_piece0 stored as bytes in memory (estimated size 57.4 KiB, free 3.0 GiB) 2024-06-12 07:05:22,284 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_0_piece0 in memory on vm-58f13156:42761 (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:22,288 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 0 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:22,301 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:22,302 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 0.0 with 1 tasks resource profile 0 2024-06-12 07:05:22,335 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 0.0 (TID 0) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 7806 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:22,902 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_0_piece0 in memory on vm-58f13156:40101 (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:25,082 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 0.0 (TID 0) in 2758 ms on vm-58f13156 (executor 1) (1/1) 2024-06-12 07:05:25,085 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 0.0, whose tasks have all completed, from pool 2024-06-12 07:05:25,100 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 0 (parquet at NativeMethodAccessorImpl.java:0) finished in 3.468 s 2024-06-12 07:05:25,104 INFO DAGScheduler [dag-scheduler-event-loop]: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:25,104 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 0: Stage finished 2024-06-12 07:05:25,110 INFO DAGScheduler [Thread-47]: Job 0 finished: parquet at NativeMethodAccessorImpl.java:0, took 3.555429 s 2024-06-12 07:05:25,677 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_0_piece0 on vm-58f13156:42761 in memory (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:25,685 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_0_piece0 on vm-58f13156:40101 in memory (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:26,167 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:26,169 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:26,638 INFO InMemoryFileIndex [Thread-47]: It took 36 ms to list leaf files for 1 paths. 
2024-06-12 07:05:26,715 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:26,716 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 1 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:26,716 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 1 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:26,716 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:26,716 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:26,718 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 1 (MapPartitionsRDD[3] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:26,739 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_1 stored as values in memory (estimated size 206.6 KiB, free 3.0 GiB) 2024-06-12 07:05:26,741 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_1_piece0 stored as bytes in memory (estimated size 57.4 KiB, free 3.0 GiB) 2024-06-12 07:05:26,741 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_1_piece0 in memory on vm-58f13156:42761 (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:26,742 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 1 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:26,743 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:26,743 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 1.0 with 1 tasks resource profile 0 2024-06-12 07:05:26,745 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 1.0 (TID 1) (vm-14223739, executor 2, partition 0, PROCESS_LOCAL, 7990 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:28,238 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_1_piece0 in memory on vm-14223739:44757 (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:32,531 INFO TaskSetManager [task-result-getter-1]: Finished task 0.0 in stage 1.0 (TID 1) in 5787 ms on vm-14223739 (executor 2) (1/1) 2024-06-12 07:05:32,531 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 1.0, whose tasks have all completed, from pool 2024-06-12 07:05:32,532 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 1 (parquet at NativeMethodAccessorImpl.java:0) finished in 5.813 s 2024-06-12 07:05:32,533 INFO DAGScheduler [dag-scheduler-event-loop]: Job 1 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:32,533 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 1: Stage finished 2024-06-12 07:05:32,533 INFO DAGScheduler [Thread-47]: Job 1 finished: parquet at NativeMethodAccessorImpl.java:0, took 5.818451 s 2024-06-12 07:05:32,968 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:32,970 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:33,106 INFO InMemoryFileIndex [Thread-47]: It took 31 ms to list leaf files for 1 paths. 
2024-06-12 07:05:33,171 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:33,172 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 2 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:33,172 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 2 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:33,172 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:33,172 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:33,174 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 2 (MapPartitionsRDD[5] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:33,195 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_2 stored as values in memory (estimated size 206.5 KiB, free 3.0 GiB) 2024-06-12 07:05:33,197 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_2_piece0 stored as bytes in memory (estimated size 57.3 KiB, free 3.0 GiB) 2024-06-12 07:05:33,198 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_2_piece0 in memory on vm-58f13156:42761 (size: 57.3 KiB, free: 3.0 GiB) 2024-06-12 07:05:33,198 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 2 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:33,199 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[5] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:33,199 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 2.0 with 1 tasks resource profile 0 2024-06-12 07:05:33,201 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 2.0 (TID 2) (vm-14223739, executor 2, partition 0, PROCESS_LOCAL, 7987 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:33,237 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_2_piece0 in memory on vm-14223739:44757 (size: 57.3 KiB, free: 2.2 GiB) 2024-06-12 07:05:33,366 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 2.0 (TID 2) in 166 ms on vm-14223739 (executor 2) (1/1) 2024-06-12 07:05:33,367 INFO YarnClusterScheduler [task-result-getter-2]: Removed TaskSet 2.0, whose tasks have all completed, from pool 2024-06-12 07:05:33,368 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 2 (parquet at NativeMethodAccessorImpl.java:0) finished in 0.193 s 2024-06-12 07:05:33,369 INFO DAGScheduler [dag-scheduler-event-loop]: Job 2 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:33,369 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 2: Stage finished 2024-06-12 07:05:33,369 INFO DAGScheduler [Thread-47]: Job 2 finished: parquet at NativeMethodAccessorImpl.java:0, took 0.197981 s 2024-06-12 07:05:33,532 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:33,534 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:33,567 INFO InMemoryFileIndex [Thread-47]: It took 9 ms to list leaf files for 1 paths. 
2024-06-12 07:05:33,627 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:33,628 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 3 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:33,628 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 3 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:33,628 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:33,628 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:33,630 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 3 (MapPartitionsRDD[7] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:33,652 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_3 stored as values in memory (estimated size 206.6 KiB, free 3.0 GiB) 2024-06-12 07:05:33,653 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_3_piece0 stored as bytes in memory (estimated size 57.4 KiB, free 3.0 GiB) 2024-06-12 07:05:33,654 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_3_piece0 in memory on vm-58f13156:42761 (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:33,654 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 3 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:33,655 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[7] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:33,655 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 3.0 with 1 tasks resource profile 0 2024-06-12 07:05:33,656 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 3.0 (TID 3) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 7808 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:33,685 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_3_piece0 in memory on vm-58f13156:40101 (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:33,818 INFO TaskSetManager [task-result-getter-3]: Finished task 0.0 in stage 3.0 (TID 3) in 162 ms on vm-58f13156 (executor 1) (1/1) 2024-06-12 07:05:33,819 INFO YarnClusterScheduler [task-result-getter-3]: Removed TaskSet 3.0, whose tasks have all completed, from pool 2024-06-12 07:05:33,819 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 3 (parquet at NativeMethodAccessorImpl.java:0) finished in 0.188 s 2024-06-12 07:05:33,820 INFO DAGScheduler [dag-scheduler-event-loop]: Job 3 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:33,821 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 3: Stage finished 2024-06-12 07:05:33,821 INFO DAGScheduler [Thread-47]: Job 3 finished: parquet at NativeMethodAccessorImpl.java:0, took 0.193746 s 2024-06-12 07:05:33,833 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:33,835 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:37,735 INFO HadoopFSUtils [Thread-47]: Listing leaf files and directories in parallel under 105 paths. 
The first several paths are: wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2023-2024/0/0/AggPurchasePvt.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/01/AggPurchaseHeader-a7658f80-c198-4653-ba7f-8e0055320cf9.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/02/AggPurchaseHeader-3a75f612-c959-469d-9ed4-ffb264e91df8.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/03/AggPurchaseHeader-424115cf-8713-4ff2-aed1-c9e236773cdb.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/04/AggPurchaseHeader-76ff6ef3-ac3f-4f5a-8044-801848830e37.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/05/AggPurchaseHeader-34188563-4ab9-4ca4-9ae5-4e81e84b37fb.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/06/AggPurchaseHeader-b18ee06b-7e2e-48e9-ab5b-415e18d9ccc4.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/07/AggPurchaseHeader-99db7788-4800-41df-ac1b-6204b9be4c83.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/08/AggPurchaseHeader-80919921-583a-45be-85f6-795a5b313015.parquet, wasbs://loy-ds-fraud@dsmlstoragexfsyt.blob.core.windows.net/prod/fraud/data/AggPurchasePvt/2024/03/09/AggPurchaseHeader-ea045948-63a4-4bbd-8b95-d15750d93ccc.parquet. 
2024-06-12 07:05:37,800 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:37,801 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 4 (parquet at NativeMethodAccessorImpl.java:0) with 105 output partitions 2024-06-12 07:05:37,801 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 4 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:37,802 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:37,802 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:37,802 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 4 (MapPartitionsRDD[10] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:37,828 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_4 stored as values in memory (estimated size 206.5 KiB, free 3.0 GiB) 2024-06-12 07:05:37,831 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_4_piece0 stored as bytes in memory (estimated size 57.5 KiB, free 3.0 GiB) 2024-06-12 07:05:37,831 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_4_piece0 in memory on vm-58f13156:42761 (size: 57.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:37,832 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 4 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:37,833 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 105 missing tasks from ResultStage 4 (MapPartitionsRDD[10] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 2024-06-12 07:05:37,833 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 4.0 with 105 tasks resource profile 0 2024-06-12 07:05:37,835 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 4.0 (TID 4) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 7827 
bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,836 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 4.0 (TID 5) (vm-14223739, executor 2, partition 1, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,836 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 4.0 (TID 6) (vm-58f13156, executor 1, partition 2, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,836 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 4.0 (TID 7) (vm-14223739, executor 2, partition 3, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,837 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 4.0 (TID 8) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,837 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 4.0 (TID 9) (vm-14223739, executor 2, partition 5, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,837 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 4.0 (TID 10) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,838 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 4.0 (TID 11) (vm-14223739, executor 2, partition 7, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:37,863 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_4_piece0 in memory on vm-14223739:44757 (size: 57.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:37,863 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_4_piece0 in memory on vm-58f13156:40101 (size: 57.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:37,976 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 8.0 in stage 4.0 
(TID 12) (vm-58f13156, executor 1, partition 8, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,000 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 9.0 in stage 4.0 (TID 13) (vm-14223739, executor 2, partition 9, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,000 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 4.0 (TID 4) in 165 ms on vm-58f13156 (executor 1) (1/105) 2024-06-12 07:05:38,002 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 10.0 in stage 4.0 (TID 14) (vm-58f13156, executor 1, partition 10, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,002 INFO TaskSetManager [task-result-getter-1]: Finished task 5.0 in stage 4.0 (TID 9) in 165 ms on vm-14223739 (executor 2) (2/105) 2024-06-12 07:05:38,002 INFO TaskSetManager [task-result-getter-2]: Finished task 4.0 in stage 4.0 (TID 8) in 165 ms on vm-58f13156 (executor 1) (3/105) 2024-06-12 07:05:38,004 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 11.0 in stage 4.0 (TID 15) (vm-58f13156, executor 1, partition 11, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,004 INFO TaskSetManager [task-result-getter-3]: Finished task 2.0 in stage 4.0 (TID 6) in 168 ms on vm-58f13156 (executor 1) (4/105) 2024-06-12 07:05:38,007 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 12.0 in stage 4.0 (TID 16) (vm-58f13156, executor 1, partition 12, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,007 INFO TaskSetManager [task-result-getter-0]: Finished task 6.0 in stage 4.0 (TID 10) in 170 ms on vm-58f13156 (executor 1) (5/105) 2024-06-12 07:05:38,017 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 13.0 in stage 4.0 (TID 17) (vm-58f13156, executor 1, partition 13, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,019 INFO 
TaskSetManager [task-result-getter-1]: Finished task 8.0 in stage 4.0 (TID 12) in 42 ms on vm-58f13156 (executor 1) (6/105) 2024-06-12 07:05:38,019 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 14.0 in stage 4.0 (TID 18) (vm-14223739, executor 2, partition 14, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,021 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 15.0 in stage 4.0 (TID 19) (vm-14223739, executor 2, partition 15, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,023 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 4.0 (TID 11) in 185 ms on vm-14223739 (executor 2) (7/105) 2024-06-12 07:05:38,027 INFO TaskSetManager [task-result-getter-3]: Finished task 1.0 in stage 4.0 (TID 5) in 189 ms on vm-14223739 (executor 2) (8/105) 2024-06-12 07:05:38,034 INFO TaskSetManager [task-result-getter-0]: Finished task 3.0 in stage 4.0 (TID 7) in 198 ms on vm-14223739 (executor 2) (9/105) 2024-06-12 07:05:38,036 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 16.0 in stage 4.0 (TID 20) (vm-14223739, executor 2, partition 16, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,038 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 17.0 in stage 4.0 (TID 21) (vm-14223739, executor 2, partition 17, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,039 INFO TaskSetManager [task-result-getter-1]: Finished task 9.0 in stage 4.0 (TID 13) in 57 ms on vm-14223739 (executor 2) (10/105) 2024-06-12 07:05:38,041 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 18.0 in stage 4.0 (TID 22) (vm-58f13156, executor 1, partition 18, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,042 INFO TaskSetManager [task-result-getter-2]: Finished task 10.0 in stage 4.0 (TID 14) in 41 ms on vm-58f13156 (executor 1) (11/105) 2024-06-12 
07:05:38,045 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 19.0 in stage 4.0 (TID 23) (vm-58f13156, executor 1, partition 19, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,047 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 20.0 in stage 4.0 (TID 24) (vm-58f13156, executor 1, partition 20, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,047 INFO TaskSetManager [task-result-getter-0]: Finished task 12.0 in stage 4.0 (TID 16) in 41 ms on vm-58f13156 (executor 1) (12/105) 2024-06-12 07:05:38,047 INFO TaskSetManager [task-result-getter-3]: Finished task 11.0 in stage 4.0 (TID 15) in 44 ms on vm-58f13156 (executor 1) (13/105) 2024-06-12 07:05:38,051 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 21.0 in stage 4.0 (TID 25) (vm-14223739, executor 2, partition 21, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,051 INFO TaskSetManager [task-result-getter-1]: Finished task 14.0 in stage 4.0 (TID 18) in 32 ms on vm-14223739 (executor 2) (14/105) 2024-06-12 07:05:38,063 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 22.0 in stage 4.0 (TID 26) (vm-58f13156, executor 1, partition 22, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,063 INFO TaskSetManager [task-result-getter-2]: Finished task 13.0 in stage 4.0 (TID 17) in 46 ms on vm-58f13156 (executor 1) (15/105) 2024-06-12 07:05:38,070 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 23.0 in stage 4.0 (TID 27) (vm-14223739, executor 2, partition 23, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,070 INFO TaskSetManager [task-result-getter-0]: Finished task 16.0 in stage 4.0 (TID 20) in 35 ms on vm-14223739 (executor 2) (16/105) 2024-06-12 07:05:38,072 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 24.0 in stage 4.0 (TID 28) (vm-14223739, 
executor 2, partition 24, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,074 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 25.0 in stage 4.0 (TID 29) (vm-14223739, executor 2, partition 25, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,074 INFO TaskSetManager [task-result-getter-3]: Finished task 17.0 in stage 4.0 (TID 21) in 36 ms on vm-14223739 (executor 2) (17/105) 2024-06-12 07:05:38,074 INFO TaskSetManager [task-result-getter-1]: Finished task 15.0 in stage 4.0 (TID 19) in 53 ms on vm-14223739 (executor 2) (18/105) 2024-06-12 07:05:38,087 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 26.0 in stage 4.0 (TID 30) (vm-14223739, executor 2, partition 26, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,087 INFO TaskSetManager [task-result-getter-2]: Finished task 21.0 in stage 4.0 (TID 25) in 36 ms on vm-14223739 (executor 2) (19/105) 2024-06-12 07:05:38,090 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 27.0 in stage 4.0 (TID 31) (vm-58f13156, executor 1, partition 27, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,091 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 28.0 in stage 4.0 (TID 32) (vm-58f13156, executor 1, partition 28, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,092 INFO TaskSetManager [task-result-getter-0]: Finished task 20.0 in stage 4.0 (TID 24) in 46 ms on vm-58f13156 (executor 1) (20/105) 2024-06-12 07:05:38,092 INFO TaskSetManager [task-result-getter-3]: Finished task 18.0 in stage 4.0 (TID 22) in 51 ms on vm-58f13156 (executor 1) (21/105) 2024-06-12 07:05:38,096 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 29.0 in stage 4.0 (TID 33) (vm-58f13156, executor 1, partition 29, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,096 INFO TaskSetManager 
[task-result-getter-1]: Finished task 19.0 in stage 4.0 (TID 23) in 51 ms on vm-58f13156 (executor 1) (22/105) 2024-06-12 07:05:38,099 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 30.0 in stage 4.0 (TID 34) (vm-14223739, executor 2, partition 30, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,100 INFO TaskSetManager [task-result-getter-2]: Finished task 23.0 in stage 4.0 (TID 27) in 30 ms on vm-14223739 (executor 2) (23/105) 2024-06-12 07:05:38,102 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 31.0 in stage 4.0 (TID 35) (vm-58f13156, executor 1, partition 31, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,102 INFO TaskSetManager [task-result-getter-0]: Finished task 22.0 in stage 4.0 (TID 26) in 39 ms on vm-58f13156 (executor 1) (24/105) 2024-06-12 07:05:38,122 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 32.0 in stage 4.0 (TID 36) (vm-14223739, executor 2, partition 32, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,122 INFO TaskSetManager [task-result-getter-3]: Finished task 25.0 in stage 4.0 (TID 29) in 49 ms on vm-14223739 (executor 2) (25/105) 2024-06-12 07:05:38,123 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 33.0 in stage 4.0 (TID 37) (vm-14223739, executor 2, partition 33, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,124 INFO TaskSetManager [task-result-getter-1]: Finished task 24.0 in stage 4.0 (TID 28) in 53 ms on vm-14223739 (executor 2) (26/105) 2024-06-12 07:05:38,133 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 34.0 in stage 4.0 (TID 38) (vm-14223739, executor 2, partition 34, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,135 INFO TaskSetManager [task-result-getter-2]: Finished task 26.0 in stage 4.0 (TID 30) in 49 ms on vm-14223739 (executor 2) (27/105) 2024-06-12 
07:05:38,136 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 35.0 in stage 4.0 (TID 39) (vm-58f13156, executor 1, partition 35, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,137 INFO TaskSetManager [task-result-getter-0]: Finished task 31.0 in stage 4.0 (TID 35) in 36 ms on vm-58f13156 (executor 1) (28/105) 2024-06-12 07:05:38,138 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 36.0 in stage 4.0 (TID 40) (vm-58f13156, executor 1, partition 36, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,139 INFO TaskSetManager [task-result-getter-3]: Finished task 27.0 in stage 4.0 (TID 31) in 51 ms on vm-58f13156 (executor 1) (29/105) 2024-06-12 07:05:38,140 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 37.0 in stage 4.0 (TID 41) (vm-58f13156, executor 1, partition 37, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,140 INFO TaskSetManager [task-result-getter-1]: Finished task 28.0 in stage 4.0 (TID 32) in 49 ms on vm-58f13156 (executor 1) (30/105) 2024-06-12 07:05:38,142 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 38.0 in stage 4.0 (TID 42) (vm-58f13156, executor 1, partition 38, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,142 INFO TaskSetManager [task-result-getter-2]: Finished task 29.0 in stage 4.0 (TID 33) in 48 ms on vm-58f13156 (executor 1) (31/105) 2024-06-12 07:05:38,145 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 39.0 in stage 4.0 (TID 43) (vm-14223739, executor 2, partition 39, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,146 INFO TaskSetManager [task-result-getter-0]: Finished task 30.0 in stage 4.0 (TID 34) in 47 ms on vm-14223739 (executor 2) (32/105) 2024-06-12 07:05:38,166 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 40.0 in stage 4.0 (TID 44) (vm-14223739, 
executor 2, partition 40, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,166 INFO TaskSetManager [task-result-getter-3]: Finished task 32.0 in stage 4.0 (TID 36) in 45 ms on vm-14223739 (executor 2) (33/105) 2024-06-12 07:05:38,168 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 41.0 in stage 4.0 (TID 45) (vm-58f13156, executor 1, partition 41, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,168 INFO TaskSetManager [task-result-getter-1]: Finished task 36.0 in stage 4.0 (TID 40) in 30 ms on vm-58f13156 (executor 1) (34/105) 2024-06-12 07:05:38,169 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 42.0 in stage 4.0 (TID 46) (vm-58f13156, executor 1, partition 42, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,169 INFO TaskSetManager [task-result-getter-2]: Finished task 35.0 in stage 4.0 (TID 39) in 33 ms on vm-58f13156 (executor 1) (35/105) 2024-06-12 07:05:38,170 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 43.0 in stage 4.0 (TID 47) (vm-58f13156, executor 1, partition 43, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,170 INFO TaskSetManager [task-result-getter-0]: Finished task 37.0 in stage 4.0 (TID 41) in 30 ms on vm-58f13156 (executor 1) (36/105) 2024-06-12 07:05:38,171 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 44.0 in stage 4.0 (TID 48) (vm-58f13156, executor 1, partition 44, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,172 INFO TaskSetManager [task-result-getter-3]: Finished task 38.0 in stage 4.0 (TID 42) in 31 ms on vm-58f13156 (executor 1) (37/105) 2024-06-12 07:05:38,175 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 45.0 in stage 4.0 (TID 49) (vm-14223739, executor 2, partition 45, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,176 INFO TaskSetManager 
[task-result-getter-1]: Finished task 39.0 in stage 4.0 (TID 43) in 32 ms on vm-14223739 (executor 2) (38/105) 2024-06-12 07:05:38,179 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 46.0 in stage 4.0 (TID 50) (vm-14223739, executor 2, partition 46, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,180 INFO TaskSetManager [task-result-getter-2]: Finished task 33.0 in stage 4.0 (TID 37) in 57 ms on vm-14223739 (executor 2) (39/105) 2024-06-12 07:05:38,181 INFO TaskSetManager [task-result-getter-0]: Finished task 34.0 in stage 4.0 (TID 38) in 48 ms on vm-14223739 (executor 2) (40/105) 2024-06-12 07:05:38,184 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 47.0 in stage 4.0 (TID 51) (vm-14223739, executor 2, partition 47, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,203 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 48.0 in stage 4.0 (TID 52) (vm-14223739, executor 2, partition 48, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,203 INFO TaskSetManager [task-result-getter-3]: Finished task 40.0 in stage 4.0 (TID 44) in 38 ms on vm-14223739 (executor 2) (41/105) 2024-06-12 07:05:38,209 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 49.0 in stage 4.0 (TID 53) (vm-58f13156, executor 1, partition 49, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,210 INFO TaskSetManager [task-result-getter-1]: Finished task 41.0 in stage 4.0 (TID 45) in 43 ms on vm-58f13156 (executor 1) (42/105) 2024-06-12 07:05:38,212 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 50.0 in stage 4.0 (TID 54) (vm-58f13156, executor 1, partition 50, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,213 INFO TaskSetManager [task-result-getter-2]: Finished task 43.0 in stage 4.0 (TID 47) in 43 ms on vm-58f13156 (executor 1) (43/105) 2024-06-12 
07:05:38,214 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 51.0 in stage 4.0 (TID 55) (vm-58f13156, executor 1, partition 51, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,214 INFO TaskSetManager [task-result-getter-0]: Finished task 42.0 in stage 4.0 (TID 46) in 45 ms on vm-58f13156 (executor 1) (44/105) 2024-06-12 07:05:38,215 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 52.0 in stage 4.0 (TID 56) (vm-58f13156, executor 1, partition 52, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,215 INFO TaskSetManager [task-result-getter-3]: Finished task 44.0 in stage 4.0 (TID 48) in 44 ms on vm-58f13156 (executor 1) (45/105) 2024-06-12 07:05:38,223 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 53.0 in stage 4.0 (TID 57) (vm-14223739, executor 2, partition 53, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,224 INFO TaskSetManager [task-result-getter-1]: Finished task 45.0 in stage 4.0 (TID 49) in 49 ms on vm-14223739 (executor 2) (46/105) 2024-06-12 07:05:38,225 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 54.0 in stage 4.0 (TID 58) (vm-14223739, executor 2, partition 54, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,225 INFO TaskSetManager [task-result-getter-2]: Finished task 46.0 in stage 4.0 (TID 50) in 46 ms on vm-14223739 (executor 2) (47/105) 2024-06-12 07:05:38,226 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 55.0 in stage 4.0 (TID 59) (vm-14223739, executor 2, partition 55, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,227 INFO TaskSetManager [task-result-getter-0]: Finished task 47.0 in stage 4.0 (TID 51) in 42 ms on vm-14223739 (executor 2) (48/105) 2024-06-12 07:05:38,232 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 56.0 in stage 4.0 (TID 60) (vm-14223739, 
executor 2, partition 56, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,233 INFO TaskSetManager [task-result-getter-3]: Finished task 48.0 in stage 4.0 (TID 52) in 30 ms on vm-14223739 (executor 2) (49/105) 2024-06-12 07:05:38,251 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 57.0 in stage 4.0 (TID 61) (vm-58f13156, executor 1, partition 57, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,251 INFO TaskSetManager [task-result-getter-1]: Finished task 50.0 in stage 4.0 (TID 54) in 39 ms on vm-58f13156 (executor 1) (50/105) 2024-06-12 07:05:38,252 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 58.0 in stage 4.0 (TID 62) (vm-58f13156, executor 1, partition 58, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,253 INFO TaskSetManager [task-result-getter-2]: Finished task 49.0 in stage 4.0 (TID 53) in 44 ms on vm-58f13156 (executor 1) (51/105) 2024-06-12 07:05:38,253 INFO TaskSetManager [task-result-getter-0]: Finished task 52.0 in stage 4.0 (TID 56) in 38 ms on vm-58f13156 (executor 1) (52/105) 2024-06-12 07:05:38,254 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 59.0 in stage 4.0 (TID 63) (vm-58f13156, executor 1, partition 59, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,263 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 60.0 in stage 4.0 (TID 64) (vm-58f13156, executor 1, partition 60, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,263 INFO TaskSetManager [task-result-getter-3]: Finished task 51.0 in stage 4.0 (TID 55) in 50 ms on vm-58f13156 (executor 1) (53/105) 2024-06-12 07:05:38,264 INFO TaskSetManager [task-result-getter-1]: Finished task 54.0 in stage 4.0 (TID 58) in 39 ms on vm-14223739 (executor 2) (54/105) 2024-06-12 07:05:38,267 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 61.0 in stage 
4.0 (TID 65) (vm-14223739, executor 2, partition 61, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,268 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 62.0 in stage 4.0 (TID 66) (vm-14223739, executor 2, partition 62, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,269 INFO TaskSetManager [task-result-getter-2]: Finished task 53.0 in stage 4.0 (TID 57) in 46 ms on vm-14223739 (executor 2) (55/105) 2024-06-12 07:05:38,272 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 63.0 in stage 4.0 (TID 67) (vm-14223739, executor 2, partition 63, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,273 INFO TaskSetManager [task-result-getter-0]: Finished task 55.0 in stage 4.0 (TID 59) in 47 ms on vm-14223739 (executor 2) (56/105) 2024-06-12 07:05:38,274 INFO TaskSetManager [task-result-getter-3]: Finished task 56.0 in stage 4.0 (TID 60) in 42 ms on vm-14223739 (executor 2) (57/105) 2024-06-12 07:05:38,275 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 64.0 in stage 4.0 (TID 68) (vm-14223739, executor 2, partition 64, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,286 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 65.0 in stage 4.0 (TID 69) (vm-58f13156, executor 1, partition 65, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,287 INFO TaskSetManager [task-result-getter-1]: Finished task 57.0 in stage 4.0 (TID 61) in 41 ms on vm-58f13156 (executor 1) (58/105) 2024-06-12 07:05:38,296 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 66.0 in stage 4.0 (TID 70) (vm-58f13156, executor 1, partition 66, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,297 INFO TaskSetManager [task-result-getter-2]: Finished task 59.0 in stage 4.0 (TID 63) in 43 ms on vm-58f13156 (executor 1) (59/105) 2024-06-12 
07:05:38,303 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 67.0 in stage 4.0 (TID 71) (vm-58f13156, executor 1, partition 67, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,303 INFO TaskSetManager [task-result-getter-0]: Finished task 58.0 in stage 4.0 (TID 62) in 51 ms on vm-58f13156 (executor 1) (60/105) 2024-06-12 07:05:38,304 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 68.0 in stage 4.0 (TID 72) (vm-14223739, executor 2, partition 68, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,304 INFO TaskSetManager [task-result-getter-3]: Finished task 61.0 in stage 4.0 (TID 65) in 38 ms on vm-14223739 (executor 2) (61/105) 2024-06-12 07:05:38,306 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 69.0 in stage 4.0 (TID 73) (vm-58f13156, executor 1, partition 69, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,306 INFO TaskSetManager [task-result-getter-1]: Finished task 60.0 in stage 4.0 (TID 64) in 44 ms on vm-58f13156 (executor 1) (62/105) 2024-06-12 07:05:38,312 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 70.0 in stage 4.0 (TID 74) (vm-14223739, executor 2, partition 70, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,313 INFO TaskSetManager [task-result-getter-2]: Finished task 62.0 in stage 4.0 (TID 66) in 45 ms on vm-14223739 (executor 2) (63/105) 2024-06-12 07:05:38,315 INFO TaskSetManager [task-result-getter-0]: Finished task 64.0 in stage 4.0 (TID 68) in 41 ms on vm-14223739 (executor 2) (64/105) 2024-06-12 07:05:38,317 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 71.0 in stage 4.0 (TID 75) (vm-14223739, executor 2, partition 71, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,318 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 72.0 in stage 4.0 (TID 76) (vm-14223739, 
executor 2, partition 72, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,318 INFO TaskSetManager [task-result-getter-3]: Finished task 63.0 in stage 4.0 (TID 67) in 48 ms on vm-14223739 (executor 2) (65/105) 2024-06-12 07:05:38,320 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 73.0 in stage 4.0 (TID 77) (vm-58f13156, executor 1, partition 73, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,321 INFO TaskSetManager [task-result-getter-1]: Finished task 65.0 in stage 4.0 (TID 69) in 35 ms on vm-58f13156 (executor 1) (66/105) 2024-06-12 07:05:38,326 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 74.0 in stage 4.0 (TID 78) (vm-58f13156, executor 1, partition 74, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,326 INFO TaskSetManager [task-result-getter-2]: Finished task 66.0 in stage 4.0 (TID 70) in 30 ms on vm-58f13156 (executor 1) (67/105) 2024-06-12 07:05:38,336 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 75.0 in stage 4.0 (TID 79) (vm-58f13156, executor 1, partition 75, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,336 INFO TaskSetManager [task-result-getter-0]: Finished task 67.0 in stage 4.0 (TID 71) in 34 ms on vm-58f13156 (executor 1) (68/105) 2024-06-12 07:05:38,341 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 76.0 in stage 4.0 (TID 80) (vm-58f13156, executor 1, partition 76, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,344 INFO TaskSetManager [task-result-getter-3]: Finished task 69.0 in stage 4.0 (TID 73) in 38 ms on vm-58f13156 (executor 1) (69/105) 2024-06-12 07:05:38,348 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 77.0 in stage 4.0 (TID 81) (vm-14223739, executor 2, partition 77, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,349 INFO TaskSetManager 
[task-result-getter-1]: Finished task 68.0 in stage 4.0 (TID 72) in 45 ms on vm-14223739 (executor 2) (70/105) 2024-06-12 07:05:38,352 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 78.0 in stage 4.0 (TID 82) (vm-14223739, executor 2, partition 78, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,352 INFO TaskSetManager [task-result-getter-2]: Finished task 71.0 in stage 4.0 (TID 75) in 36 ms on vm-14223739 (executor 2) (71/105) 2024-06-12 07:05:38,353 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 79.0 in stage 4.0 (TID 83) (vm-14223739, executor 2, partition 79, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,353 INFO TaskSetManager [task-result-getter-0]: Finished task 70.0 in stage 4.0 (TID 74) in 41 ms on vm-14223739 (executor 2) (72/105) 2024-06-12 07:05:38,354 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 80.0 in stage 4.0 (TID 84) (vm-14223739, executor 2, partition 80, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,355 INFO TaskSetManager [task-result-getter-3]: Finished task 72.0 in stage 4.0 (TID 76) in 37 ms on vm-14223739 (executor 2) (73/105) 2024-06-12 07:05:38,357 INFO TaskSetManager [task-result-getter-1]: Finished task 73.0 in stage 4.0 (TID 77) in 37 ms on vm-58f13156 (executor 1) (74/105) 2024-06-12 07:05:38,359 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 81.0 in stage 4.0 (TID 85) (vm-58f13156, executor 1, partition 81, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,362 INFO TaskSetManager [task-result-getter-2]: Finished task 74.0 in stage 4.0 (TID 78) in 36 ms on vm-58f13156 (executor 1) (75/105) 2024-06-12 07:05:38,364 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 82.0 in stage 4.0 (TID 86) (vm-58f13156, executor 1, partition 82, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:38,377 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 83.0 in stage 4.0 (TID 87) (vm-58f13156, executor 1, partition 83, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,379 INFO TaskSetManager [task-result-getter-0]: Finished task 75.0 in stage 4.0 (TID 79) in 43 ms on vm-58f13156 (executor 1) (76/105) 2024-06-12 07:05:38,387 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 84.0 in stage 4.0 (TID 88) (vm-14223739, executor 2, partition 84, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,391 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 85.0 in stage 4.0 (TID 89) (vm-14223739, executor 2, partition 85, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,392 INFO TaskSetManager [task-result-getter-3]: Finished task 77.0 in stage 4.0 (TID 81) in 45 ms on vm-14223739 (executor 2) (77/105) 2024-06-12 07:05:38,392 INFO TaskSetManager [task-result-getter-1]: Finished task 78.0 in stage 4.0 (TID 82) in 41 ms on vm-14223739 (executor 2) (78/105) 2024-06-12 07:05:38,396 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 86.0 in stage 4.0 (TID 90) (vm-14223739, executor 2, partition 86, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,397 INFO TaskSetManager [task-result-getter-2]: Finished task 79.0 in stage 4.0 (TID 83) in 44 ms on vm-14223739 (executor 2) (79/105) 2024-06-12 07:05:38,397 INFO TaskSetManager [task-result-getter-0]: Finished task 80.0 in stage 4.0 (TID 84) in 43 ms on vm-14223739 (executor 2) (80/105) 2024-06-12 07:05:38,399 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 87.0 in stage 4.0 (TID 91) (vm-14223739, executor 2, partition 87, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,399 INFO TaskSetManager [task-result-getter-3]: Finished task 76.0 in stage 4.0 (TID 80) in 58 ms on vm-58f13156 
(executor 1) (81/105) 2024-06-12 07:05:38,403 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 88.0 in stage 4.0 (TID 92) (vm-58f13156, executor 1, partition 88, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,405 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 89.0 in stage 4.0 (TID 93) (vm-58f13156, executor 1, partition 89, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,405 INFO TaskSetManager [task-result-getter-1]: Finished task 81.0 in stage 4.0 (TID 85) in 46 ms on vm-58f13156 (executor 1) (82/105) 2024-06-12 07:05:38,422 INFO TaskSetManager [task-result-getter-2]: Finished task 83.0 in stage 4.0 (TID 87) in 45 ms on vm-58f13156 (executor 1) (83/105) 2024-06-12 07:05:38,424 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 90.0 in stage 4.0 (TID 94) (vm-58f13156, executor 1, partition 90, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,426 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 91.0 in stage 4.0 (TID 95) (vm-58f13156, executor 1, partition 91, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,427 INFO TaskSetManager [task-result-getter-0]: Finished task 82.0 in stage 4.0 (TID 86) in 63 ms on vm-58f13156 (executor 1) (84/105) 2024-06-12 07:05:38,429 INFO TaskSetManager [task-result-getter-3]: Finished task 85.0 in stage 4.0 (TID 89) in 40 ms on vm-14223739 (executor 2) (85/105) 2024-06-12 07:05:38,431 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 92.0 in stage 4.0 (TID 96) (vm-14223739, executor 2, partition 92, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,432 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 93.0 in stage 4.0 (TID 97) (vm-14223739, executor 2, partition 93, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,435 INFO TaskSetManager 
[task-result-getter-1]: Finished task 84.0 in stage 4.0 (TID 88) in 49 ms on vm-14223739 (executor 2) (86/105) 2024-06-12 07:05:38,437 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 94.0 in stage 4.0 (TID 98) (vm-14223739, executor 2, partition 94, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,437 INFO TaskSetManager [task-result-getter-2]: Finished task 87.0 in stage 4.0 (TID 91) in 39 ms on vm-14223739 (executor 2) (87/105) 2024-06-12 07:05:38,444 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 95.0 in stage 4.0 (TID 99) (vm-14223739, executor 2, partition 95, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,444 INFO TaskSetManager [task-result-getter-0]: Finished task 86.0 in stage 4.0 (TID 90) in 48 ms on vm-14223739 (executor 2) (88/105) 2024-06-12 07:05:38,467 INFO TaskSetManager [task-result-getter-3]: Finished task 88.0 in stage 4.0 (TID 92) in 66 ms on vm-58f13156 (executor 1) (89/105) 2024-06-12 07:05:38,469 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 96.0 in stage 4.0 (TID 100) (vm-58f13156, executor 1, partition 96, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,474 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 97.0 in stage 4.0 (TID 101) (vm-58f13156, executor 1, partition 97, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,475 INFO TaskSetManager [task-result-getter-1]: Finished task 91.0 in stage 4.0 (TID 95) in 49 ms on vm-58f13156 (executor 1) (90/105) 2024-06-12 07:05:38,477 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 98.0 in stage 4.0 (TID 102) (vm-58f13156, executor 1, partition 98, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,478 INFO TaskSetManager [task-result-getter-2]: Finished task 90.0 in stage 4.0 (TID 94) in 54 ms on vm-58f13156 (executor 1) (91/105) 2024-06-12 
07:05:38,480 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 99.0 in stage 4.0 (TID 103) (vm-14223739, executor 2, partition 99, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,480 INFO TaskSetManager [task-result-getter-0]: Finished task 93.0 in stage 4.0 (TID 97) in 48 ms on vm-14223739 (executor 2) (92/105) 2024-06-12 07:05:38,481 INFO TaskSetManager [task-result-getter-3]: Finished task 92.0 in stage 4.0 (TID 96) in 51 ms on vm-14223739 (executor 2) (93/105) 2024-06-12 07:05:38,483 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 100.0 in stage 4.0 (TID 104) (vm-14223739, executor 2, partition 100, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,484 INFO TaskSetManager [task-result-getter-1]: Finished task 89.0 in stage 4.0 (TID 93) in 80 ms on vm-58f13156 (executor 1) (94/105) 2024-06-12 07:05:38,487 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 101.0 in stage 4.0 (TID 105) (vm-58f13156, executor 1, partition 101, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,492 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 102.0 in stage 4.0 (TID 106) (vm-14223739, executor 2, partition 102, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,494 INFO TaskSetManager [task-result-getter-2]: Finished task 95.0 in stage 4.0 (TID 99) in 56 ms on vm-14223739 (executor 2) (95/105) 2024-06-12 07:05:38,495 INFO TaskSetManager [task-result-getter-0]: Finished task 94.0 in stage 4.0 (TID 98) in 58 ms on vm-14223739 (executor 2) (96/105) 2024-06-12 07:05:38,497 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 103.0 in stage 4.0 (TID 107) (vm-14223739, executor 2, partition 103, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,516 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 104.0 in stage 4.0 (TID 108) 
(vm-58f13156, executor 1, partition 104, PROCESS_LOCAL, 7864 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,517 INFO TaskSetManager [task-result-getter-3]: Finished task 96.0 in stage 4.0 (TID 100) in 49 ms on vm-58f13156 (executor 1) (97/105) 2024-06-12 07:05:38,529 INFO TaskSetManager [task-result-getter-1]: Finished task 97.0 in stage 4.0 (TID 101) in 55 ms on vm-58f13156 (executor 1) (98/105) 2024-06-12 07:05:38,529 INFO TaskSetManager [task-result-getter-0]: Finished task 99.0 in stage 4.0 (TID 103) in 50 ms on vm-14223739 (executor 2) (99/105) 2024-06-12 07:05:38,529 INFO TaskSetManager [task-result-getter-2]: Finished task 100.0 in stage 4.0 (TID 104) in 46 ms on vm-14223739 (executor 2) (100/105) 2024-06-12 07:05:38,536 INFO TaskSetManager [task-result-getter-3]: Finished task 102.0 in stage 4.0 (TID 106) in 45 ms on vm-14223739 (executor 2) (101/105) 2024-06-12 07:05:38,538 INFO TaskSetManager [task-result-getter-1]: Finished task 103.0 in stage 4.0 (TID 107) in 42 ms on vm-14223739 (executor 2) (102/105) 2024-06-12 07:05:38,539 INFO TaskSetManager [task-result-getter-0]: Finished task 98.0 in stage 4.0 (TID 102) in 62 ms on vm-58f13156 (executor 1) (103/105) 2024-06-12 07:05:38,545 INFO TaskSetManager [task-result-getter-2]: Finished task 104.0 in stage 4.0 (TID 108) in 29 ms on vm-58f13156 (executor 1) (104/105) 2024-06-12 07:05:38,565 INFO TaskSetManager [task-result-getter-3]: Finished task 101.0 in stage 4.0 (TID 105) in 79 ms on vm-58f13156 (executor 1) (105/105) 2024-06-12 07:05:38,565 INFO YarnClusterScheduler [task-result-getter-3]: Removed TaskSet 4.0, whose tasks have all completed, from pool 2024-06-12 07:05:38,566 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 4 (parquet at NativeMethodAccessorImpl.java:0) finished in 0.759 s 2024-06-12 07:05:38,567 INFO DAGScheduler [dag-scheduler-event-loop]: Job 4 is finished. 
Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:38,567 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 4: Stage finished 2024-06-12 07:05:38,567 INFO DAGScheduler [Thread-47]: Job 4 finished: parquet at NativeMethodAccessorImpl.java:0, took 0.767315 s 2024-06-12 07:05:38,586 INFO InMemoryFileIndex [Thread-47]: It took 852 ms to list leaf files for 105 paths. 2024-06-12 07:05:38,697 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:38,698 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 5 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:38,698 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 5 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:38,698 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:38,699 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:38,699 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 5 (MapPartitionsRDD[12] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:38,725 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_5 stored as values in memory (estimated size 206.6 KiB, free 3.0 GiB) 2024-06-12 07:05:38,730 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_5_piece0 stored as bytes in memory (estimated size 57.4 KiB, free 3.0 GiB) 2024-06-12 07:05:38,731 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_5_piece0 in memory on vm-58f13156:42761 (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:38,732 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 5 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:38,732 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 5 (MapPartitionsRDD[12] at parquet at 
NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:38,732 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 5.0 with 1 tasks resource profile 0 2024-06-12 07:05:38,735 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 5.0 (TID 109) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 7837 bytes) taskResourceAssignments Map() 2024-06-12 07:05:38,758 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_5_piece0 in memory on vm-58f13156:40101 (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:38,911 INFO TaskSetManager [task-result-getter-1]: Finished task 0.0 in stage 5.0 (TID 109) in 177 ms on vm-58f13156 (executor 1) (1/1) 2024-06-12 07:05:38,912 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 5.0, whose tasks have all completed, from pool 2024-06-12 07:05:38,912 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 5 (parquet at NativeMethodAccessorImpl.java:0) finished in 0.212 s 2024-06-12 07:05:38,912 INFO DAGScheduler [dag-scheduler-event-loop]: Job 5 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:38,913 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 5: Stage finished 2024-06-12 07:05:38,913 INFO DAGScheduler [Thread-47]: Job 5 finished: parquet at NativeMethodAccessorImpl.java:0, took 0.215457 s 2024-06-12 07:05:38,955 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:38,956 INFO CosmosItemsDataSource [Thread-47]: Instantiated CosmosItemsDataSource 2024-06-12 07:05:38,988 INFO InMemoryFileIndex [Thread-47]: It took 9 ms to list leaf files for 1 paths. 
2024-06-12 07:05:39,068 INFO SparkContext [Thread-47]: Starting job: parquet at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:39,068 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 6 (parquet at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:39,069 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 6 (parquet at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:39,069 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:39,069 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:39,069 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 6 (MapPartitionsRDD[14] at parquet at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:39,088 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_6 stored as values in memory (estimated size 206.6 KiB, free 3.0 GiB) 2024-06-12 07:05:39,090 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_6_piece0 stored as bytes in memory (estimated size 57.4 KiB, free 3.0 GiB) 2024-06-12 07:05:39,091 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_6_piece0 in memory on vm-58f13156:42761 (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,092 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 6 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:39,092 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 6 (MapPartitionsRDD[14] at parquet at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:39,092 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 6.0 with 1 tasks resource profile 0 2024-06-12 07:05:39,094 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 6.0 (TID 110) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 7808 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:39,106 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_6_piece0 in memory on vm-58f13156:40101 (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,125 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 6.0 (TID 110) in 32 ms on vm-58f13156 (executor 1) (1/1) 2024-06-12 07:05:39,125 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 6.0, whose tasks have all completed, from pool 2024-06-12 07:05:39,125 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 6 (parquet at NativeMethodAccessorImpl.java:0) finished in 0.055 s 2024-06-12 07:05:39,126 INFO DAGScheduler [dag-scheduler-event-loop]: Job 6 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:39,126 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 6: Stage finished 2024-06-12 07:05:39,126 INFO DAGScheduler [Thread-47]: Job 6 finished: parquet at NativeMethodAccessorImpl.java:0, took 0.058199 s 2024-06-12 07:05:39,561 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:39,574 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,574 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,575 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,582 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:39,582 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,582 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,582 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,609 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:39,609 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,609 INFO VegasOptimizerRule$ 
[Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,609 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,615 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:39,615 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,615 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,615 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:39,721 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_6_piece0 on vm-58f13156:42761 in memory (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,722 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_6_piece0 on vm-58f13156:40101 in memory (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,729 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_3_piece0 on vm-58f13156:42761 in memory (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,731 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_3_piece0 on vm-58f13156:40101 in memory (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,742 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_5_piece0 on vm-58f13156:42761 in memory (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,743 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_5_piece0 on vm-58f13156:40101 in memory (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,749 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_4_piece0 on vm-58f13156:42761 in memory (size: 57.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,750 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_4_piece0 on vm-58f13156:40101 in memory (size: 57.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,756 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_4_piece0 on 
vm-14223739:44757 in memory (size: 57.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,767 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_1_piece0 on vm-58f13156:42761 in memory (size: 57.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,768 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_1_piece0 on vm-14223739:44757 in memory (size: 57.4 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,779 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_2_piece0 on vm-58f13156:42761 in memory (size: 57.3 KiB, free: 3.0 GiB) 2024-06-12 07:05:39,780 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_2_piece0 on vm-14223739:44757 in memory (size: 57.3 KiB, free: 2.2 GiB) 2024-06-12 07:05:39,973 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(CustomerID) 2024-06-12 07:05:39,975 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(CustomerID#724L) 2024-06-12 07:05:40,053 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(OrderDate),GreaterThanOrEqual(OrderDate,2024-03-28) 2024-06-12 07:05:40,054 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(OrderDate#734),(OrderDate#734 >= 2024-03-28) 2024-06-12 07:05:40,057 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(Created),GreaterThanOrEqual(Created,2024-05-12 00:00:00.0) 2024-06-12 07:05:40,059 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(Created#762),(Created#762 >= 2024-05-12 00:00:00) 2024-06-12 07:05:40,136 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:05:40,136 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:40,138 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:05:40,138 INFO 
EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:40,696 INFO CodeGenerator [Thread-47]: Code generated in 262.901936 ms 2024-06-12 07:05:40,710 INFO MemoryStore [Thread-47]: Block broadcast_7 stored as values in memory (estimated size 597.5 KiB, free 3.0 GiB) 2024-06-12 07:05:40,731 INFO MemoryStore [Thread-47]: Block broadcast_7_piece0 stored as bytes in memory (estimated size 55.0 KiB, free 3.0 GiB) 2024-06-12 07:05:40,732 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_7_piece0 in memory on vm-58f13156:42761 (size: 55.0 KiB, free: 3.0 GiB) 2024-06-12 07:05:40,733 INFO SparkContext [Thread-47]: Created broadcast 7 from count at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:40,743 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 25285940 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:40,813 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 18 (count at NativeMethodAccessorImpl.java:0) as input to shuffle 0 2024-06-12 07:05:40,815 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 
2024-06-12 07:05:40,818 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 7 (count at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:05:40,818 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 7 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:40,819 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:40,819 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:40,820 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 7 (MapPartitionsRDD[18] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:40,895 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_8 stored as values in memory (estimated size 16.0 KiB, free 3.0 GiB) 2024-06-12 07:05:40,897 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_8_piece0 stored as bytes in memory (estimated size 7.5 KiB, free 3.0 GiB) 2024-06-12 07:05:40,898 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_8_piece0 in memory on vm-58f13156:42761 (size: 7.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:40,898 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 8 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:40,900 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 7 (MapPartitionsRDD[18] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:05:40,900 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 7.0 with 8 tasks resource profile 0 2024-06-12 07:05:40,905 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 7.0 (TID 111) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,906 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting 
task 1.0 in stage 7.0 (TID 112) (vm-14223739, executor 2, partition 1, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,906 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 7.0 (TID 113) (vm-58f13156, executor 1, partition 2, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,906 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 7.0 (TID 114) (vm-14223739, executor 2, partition 3, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,907 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 7.0 (TID 115) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,908 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 7.0 (TID 116) (vm-14223739, executor 2, partition 5, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,908 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 7.0 (TID 117) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,908 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 7.0 (TID 118) (vm-14223739, executor 2, partition 7, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:40,931 INFO CodeGenerator [Thread-47]: Code generated in 73.095194 ms 2024-06-12 07:05:40,944 INFO MemoryStore [Thread-47]: Block broadcast_9 stored as values in memory (estimated size 597.6 KiB, free 3.0 GiB) 2024-06-12 07:05:40,950 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_8_piece0 in memory on vm-58f13156:40101 (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:40,972 INFO MemoryStore [Thread-47]: Block broadcast_9_piece0 stored as bytes in memory (estimated size 55.1 KiB, free 3.0 GiB) 2024-06-12 07:05:40,973 
INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_9_piece0 in memory on vm-58f13156:42761 (size: 55.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:40,974 INFO SparkContext [Thread-47]: Created broadcast 9 from count at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:40,978 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 134217728 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:41,025 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_8_piece0 in memory on vm-14223739:44757 (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:41,054 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 22 (count at NativeMethodAccessorImpl.java:0) as input to shuffle 1 2024-06-12 07:05:41,054 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 8 (count at NativeMethodAccessorImpl.java:0) with 17 output partitions 2024-06-12 07:05:41,055 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 
2024-06-12 07:05:41,055 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 8 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:41,055 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:41,055 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:41,056 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 8 (MapPartitionsRDD[22] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:41,069 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_10 stored as values in memory (estimated size 33.3 KiB, free 3.0 GiB) 2024-06-12 07:05:41,071 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_10_piece0 stored as bytes in memory (estimated size 14.6 KiB, free 3.0 GiB) 2024-06-12 07:05:41,072 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_10_piece0 in memory on vm-58f13156:42761 (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:41,073 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 10 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:41,074 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 17 missing tasks from ShuffleMapStage 8 (MapPartitionsRDD[22] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 2024-06-12 07:05:41,074 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 8.0 with 17 tasks resource profile 0 2024-06-12 07:05:41,116 INFO CodeGenerator [Thread-47]: Code generated in 45.726771 ms 2024-06-12 07:05:41,122 INFO MemoryStore [Thread-47]: Block broadcast_11 stored as values in memory (estimated size 597.6 KiB, free 3.0 GiB) 2024-06-12 07:05:41,146 INFO MemoryStore [Thread-47]: Block broadcast_11_piece0 stored as bytes in memory (estimated size 55.1 KiB, free 3.0 GiB) 2024-06-12 07:05:41,147 INFO BlockManagerInfo 
[dispatcher-BlockManagerMaster]: Added broadcast_11_piece0 in memory on vm-58f13156:42761 (size: 55.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:41,148 INFO SparkContext [Thread-47]: Created broadcast 11 from count at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:41,149 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 25285940 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:41,162 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 26 (count at NativeMethodAccessorImpl.java:0) as input to shuffle 2 2024-06-12 07:05:41,163 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 9 (count at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:05:41,163 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 9 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:41,163 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:41,163 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:41,163 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 9 (MapPartitionsRDD[26] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:41,182 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_12 stored as values in memory (estimated size 33.3 KiB, free 3.0 GiB) 2024-06-12 07:05:41,184 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_12_piece0 stored as bytes in memory (estimated size 14.6 KiB, free 3.0 GiB) 2024-06-12 07:05:41,185 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_12_piece0 in memory on vm-58f13156:42761 (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:41,185 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 12 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:41,186 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 9 
(MapPartitionsRDD[26] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:05:41,186 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 9.0 with 8 tasks resource profile 0 2024-06-12 07:05:41,717 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_7_piece0 in memory on vm-58f13156:40101 (size: 55.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:42,198 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 8.0 (TID 119) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,200 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 8.0 (TID 120) (vm-58f13156, executor 1, partition 1, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,202 INFO TaskSetManager [task-result-getter-3]: Finished task 4.0 in stage 7.0 (TID 115) in 1294 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:05:42,202 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 7.0 (TID 111) in 1301 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:05:42,220 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_10_piece0 in memory on vm-58f13156:40101 (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:42,444 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_9_piece0 in memory on vm-58f13156:40101 (size: 55.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:42,777 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 8.0 (TID 121) (vm-58f13156, executor 1, partition 2, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,778 INFO TaskSetManager [task-result-getter-1]: Finished task 1.0 in stage 8.0 (TID 120) in 578 ms on vm-58f13156 (executor 1) (1/17) 2024-06-12 07:05:42,779 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 8.0 (TID 
122) (vm-58f13156, executor 1, partition 3, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,779 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 8.0 (TID 119) in 581 ms on vm-58f13156 (executor 1) (2/17) 2024-06-12 07:05:42,856 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 8.0 (TID 123) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,856 INFO TaskSetManager [task-result-getter-3]: Finished task 3.0 in stage 8.0 (TID 122) in 77 ms on vm-58f13156 (executor 1) (3/17) 2024-06-12 07:05:42,858 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 8.0 (TID 124) (vm-58f13156, executor 1, partition 5, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,859 INFO TaskSetManager [task-result-getter-2]: Finished task 2.0 in stage 8.0 (TID 121) in 81 ms on vm-58f13156 (executor 1) (4/17) 2024-06-12 07:05:42,931 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 8.0 (TID 125) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,931 INFO TaskSetManager [task-result-getter-1]: Finished task 4.0 in stage 8.0 (TID 123) in 76 ms on vm-58f13156 (executor 1) (5/17) 2024-06-12 07:05:42,963 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 8.0 (TID 126) (vm-58f13156, executor 1, partition 7, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:42,963 INFO TaskSetManager [task-result-getter-0]: Finished task 5.0 in stage 8.0 (TID 124) in 106 ms on vm-58f13156 (executor 1) (6/17) 2024-06-12 07:05:43,010 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 8.0 in stage 8.0 (TID 127) (vm-58f13156, executor 1, partition 8, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,011 INFO 
TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 8.0 (TID 125) in 81 ms on vm-58f13156 (executor 1) (7/17) 2024-06-12 07:05:43,035 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 9.0 in stage 8.0 (TID 128) (vm-58f13156, executor 1, partition 9, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,037 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 8.0 (TID 126) in 75 ms on vm-58f13156 (executor 1) (8/17) 2024-06-12 07:05:43,090 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_7_piece0 in memory on vm-14223739:44757 (size: 55.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:43,091 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 10.0 in stage 8.0 (TID 129) (vm-58f13156, executor 1, partition 10, PROCESS_LOCAL, 11084 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,092 INFO TaskSetManager [task-result-getter-1]: Finished task 8.0 in stage 8.0 (TID 127) in 82 ms on vm-58f13156 (executor 1) (9/17) 2024-06-12 07:05:43,129 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 11.0 in stage 8.0 (TID 130) (vm-58f13156, executor 1, partition 11, PROCESS_LOCAL, 11751 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,129 INFO TaskSetManager [task-result-getter-0]: Finished task 9.0 in stage 8.0 (TID 128) in 94 ms on vm-58f13156 (executor 1) (10/17) 2024-06-12 07:05:43,550 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 12.0 in stage 8.0 (TID 131) (vm-14223739, executor 2, partition 12, PROCESS_LOCAL, 11751 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,551 INFO TaskSetManager [task-result-getter-3]: Finished task 5.0 in stage 7.0 (TID 116) in 2644 ms on vm-14223739 (executor 2) (3/8) 2024-06-12 07:05:43,553 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 13.0 in stage 8.0 (TID 132) (vm-14223739, executor 2, partition 13, PROCESS_LOCAL, 11961 bytes) 
taskResourceAssignments Map() 2024-06-12 07:05:43,554 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 14.0 in stage 8.0 (TID 133) (vm-14223739, executor 2, partition 14, PROCESS_LOCAL, 11961 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,555 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 7.0 (TID 118) in 2647 ms on vm-14223739 (executor 2) (4/8) 2024-06-12 07:05:43,555 INFO TaskSetManager [task-result-getter-1]: Finished task 3.0 in stage 7.0 (TID 114) in 2649 ms on vm-14223739 (executor 2) (5/8) 2024-06-12 07:05:43,556 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 15.0 in stage 8.0 (TID 134) (vm-14223739, executor 2, partition 15, PROCESS_LOCAL, 12381 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,556 INFO TaskSetManager [task-result-getter-0]: Finished task 1.0 in stage 7.0 (TID 112) in 2651 ms on vm-14223739 (executor 2) (6/8) 2024-06-12 07:05:43,585 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_10_piece0 in memory on vm-14223739:44757 (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:43,779 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 16.0 in stage 8.0 (TID 135) (vm-58f13156, executor 1, partition 16, PROCESS_LOCAL, 8391 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,781 INFO TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 7.0 (TID 117) in 2873 ms on vm-58f13156 (executor 1) (7/8) 2024-06-12 07:05:43,931 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 9.0 (TID 136) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:43,932 INFO TaskSetManager [task-result-getter-2]: Finished task 16.0 in stage 8.0 (TID 135) in 153 ms on vm-58f13156 (executor 1) (11/17) 2024-06-12 07:05:43,941 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_12_piece0 in memory on vm-58f13156:40101 (size: 14.6 
KiB, free: 2.2 GiB) 2024-06-12 07:05:44,007 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_11_piece0 in memory on vm-58f13156:40101 (size: 55.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:44,076 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 9.0 (TID 137) (vm-58f13156, executor 1, partition 1, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:44,077 INFO TaskSetManager [task-result-getter-1]: Finished task 2.0 in stage 7.0 (TID 113) in 3171 ms on vm-58f13156 (executor 1) (8/8) 2024-06-12 07:05:44,077 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 7.0, whose tasks have all completed, from pool 2024-06-12 07:05:44,077 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 7 (count at NativeMethodAccessorImpl.java:0) finished in 3.250 s 2024-06-12 07:05:44,078 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 9.0 (TID 138) (vm-58f13156, executor 1, partition 2, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:44,078 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 9.0 (TID 136) in 147 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:05:44,078 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:05:44,079 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 9, ShuffleMapStage 8) 2024-06-12 07:05:44,079 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:05:44,080 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:05:44,096 INFO CreateAdviseEventHandler [spark-listener-group-shared]: Sending DataSkew to Advise Hub: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _stageId -> -1, _jobId -> 7, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@6dc944ea, _level -> warn, _stageAttemptId -> -1, _name -> Data skew for job 7) 2024-06-12 07:05:44,096 
INFO KustoHandler [spark-listener-group-shared]: Logging DataSkew with appId: application_1718175835080_0001 to Kusto: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _stageId -> -1, _jobId -> 7, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@6dc944ea, _level -> warn, _stageAttemptId -> -1, _name -> Data skew for job 7) 2024-06-12 07:05:44,112 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:44,112 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:44,112 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for Vector(CustomerID#737) does not exist 2024-06-12 07:05:44,112 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use Vector(CustomerID#737), returning default shuffle keys 2024-06-12 07:05:44,113 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for Vector(CustomerID#760L) does not exist 2024-06-12 07:05:44,113 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use Vector(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:44,113 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:05:44,113 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:44,113 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:05:44,113 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:44,119 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_9_piece0 in memory on vm-14223739:44757 (size: 55.1 KiB, free: 2.2 GiB) 2024-06-12 
07:05:44,157 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 9.0 (TID 139) (vm-58f13156, executor 1, partition 3, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:44,157 INFO TaskSetManager [task-result-getter-3]: Finished task 1.0 in stage 9.0 (TID 137) in 81 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:05:44,244 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 9.0 (TID 140) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:44,245 INFO TaskSetManager [task-result-getter-2]: Finished task 3.0 in stage 9.0 (TID 139) in 88 ms on vm-58f13156 (executor 1) (3/8) 2024-06-12 07:05:44,306 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 9.0 (TID 141) (vm-58f13156, executor 1, partition 5, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:44,307 INFO TaskSetManager [task-result-getter-1]: Finished task 4.0 in stage 9.0 (TID 140) in 63 ms on vm-58f13156 (executor 1) (4/8) 2024-06-12 07:05:44,367 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 9.0 (TID 142) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:44,367 INFO TaskSetManager [task-result-getter-0]: Finished task 5.0 in stage 9.0 (TID 141) in 61 ms on vm-58f13156 (executor 1) (5/8) 2024-06-12 07:05:45,544 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 9.0 (TID 143) (vm-58f13156, executor 1, partition 7, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:45,544 INFO TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 9.0 (TID 142) in 1178 ms on vm-58f13156 (executor 1) (6/8) 2024-06-12 07:05:45,607 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 9.0 (TID 143) in 63 ms on vm-58f13156 (executor 1) (7/8) 
2024-06-12 07:05:45,777 INFO TaskSetManager [task-result-getter-1]: Finished task 2.0 in stage 9.0 (TID 138) in 1700 ms on vm-58f13156 (executor 1) (8/8) 2024-06-12 07:05:45,777 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 9.0, whose tasks have all completed, from pool 2024-06-12 07:05:45,778 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 9 (count at NativeMethodAccessorImpl.java:0) finished in 4.612 s 2024-06-12 07:05:45,778 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:05:45,778 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 8) 2024-06-12 07:05:45,778 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:05:45,778 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:05:45,788 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:45,788 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:45,788 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:05:45,788 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:45,789 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:05:45,789 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:46,060 INFO TaskSetManager [task-result-getter-0]: Finished task 10.0 in stage 8.0 (TID 129) in 2969 ms on vm-58f13156 (executor 1) (12/17) 2024-06-12 07:05:46,896 INFO TaskSetManager [task-result-getter-3]: Finished task 11.0 in stage 8.0 (TID 130) in 3767 ms on vm-58f13156 (executor 
1) (13/17) 2024-06-12 07:05:48,699 INFO TaskSetManager [task-result-getter-2]: Finished task 15.0 in stage 8.0 (TID 134) in 5143 ms on vm-14223739 (executor 2) (14/17) 2024-06-12 07:05:48,718 INFO TaskSetManager [task-result-getter-1]: Finished task 12.0 in stage 8.0 (TID 131) in 5168 ms on vm-14223739 (executor 2) (15/17) 2024-06-12 07:05:49,092 INFO TaskSetManager [task-result-getter-0]: Finished task 14.0 in stage 8.0 (TID 133) in 5538 ms on vm-14223739 (executor 2) (16/17) 2024-06-12 07:05:49,269 INFO TaskSetManager [task-result-getter-3]: Finished task 13.0 in stage 8.0 (TID 132) in 5717 ms on vm-14223739 (executor 2) (17/17) 2024-06-12 07:05:49,269 INFO YarnClusterScheduler [task-result-getter-3]: Removed TaskSet 8.0, whose tasks have all completed, from pool 2024-06-12 07:05:49,270 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 8 (count at NativeMethodAccessorImpl.java:0) finished in 8.212 s 2024-06-12 07:05:49,270 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:05:49,270 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set() 2024-06-12 07:05:49,270 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:05:49,270 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:05:49,283 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:49,283 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:49,283 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:05:49,283 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:49,284 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does 
not exist 2024-06-12 07:05:49,284 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:49,293 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(1), advisory target size: 67108864, actual target size 2656857, minimum partition size: 1048576 2024-06-12 07:05:49,322 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 2024-06-12 07:05:49,348 INFO CodeGenerator [Thread-47]: Code generated in 22.070975 ms 2024-06-12 07:05:49,374 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 29 (count at NativeMethodAccessorImpl.java:0) as input to shuffle 3 2024-06-12 07:05:49,374 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 10 (count at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:05:49,374 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 11 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:49,374 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 10) 2024-06-12 07:05:49,375 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:49,375 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 11 (MapPartitionsRDD[29] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:49,406 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_13 stored as values in memory (estimated size 36.4 KiB, free 3.0 GiB) 2024-06-12 07:05:49,408 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_13_piece0 stored as bytes in memory (estimated size 16.6 KiB, free 3.0 GiB) 2024-06-12 07:05:49,408 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_13_piece0 in memory on vm-58f13156:42761 (size: 16.6 KiB, free: 3.0 GiB) 2024-06-12 
07:05:49,409 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 13 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:49,409 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 11 (MapPartitionsRDD[29] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:05:49,410 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 11.0 with 8 tasks resource profile 0 2024-06-12 07:05:49,417 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 11.0 (TID 144) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,418 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 11.0 (TID 145) (vm-14223739, executor 2, partition 1, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,418 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 11.0 (TID 146) (vm-58f13156, executor 1, partition 2, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,418 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 11.0 (TID 147) (vm-14223739, executor 2, partition 3, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,419 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 11.0 (TID 148) (vm-58f13156, executor 1, partition 4, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,419 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 11.0 (TID 149) (vm-14223739, executor 2, partition 5, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,419 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 11.0 (TID 150) (vm-58f13156, executor 1, partition 6, NODE_LOCAL, 7809 bytes) 
taskResourceAssignments Map() 2024-06-12 07:05:49,420 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 11.0 (TID 151) (vm-14223739, executor 2, partition 7, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:05:49,433 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_13_piece0 in memory on vm-58f13156:40101 (size: 16.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:49,464 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_13_piece0 in memory on vm-14223739:44757 (size: 16.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:49,486 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 1 to 10.0.32.6:33208 2024-06-12 07:05:49,619 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 1 to 10.0.32.4:57308 2024-06-12 07:05:50,117 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 11.0 (TID 144) in 701 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:05:50,117 INFO TaskSetManager [task-result-getter-1]: Finished task 2.0 in stage 11.0 (TID 146) in 699 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:05:50,120 INFO TaskSetManager [task-result-getter-0]: Finished task 4.0 in stage 11.0 (TID 148) in 701 ms on vm-58f13156 (executor 1) (3/8) 2024-06-12 07:05:50,124 INFO TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 11.0 (TID 150) in 705 ms on vm-58f13156 (executor 1) (4/8) 2024-06-12 07:05:50,409 INFO TaskSetManager [task-result-getter-2]: Finished task 5.0 in stage 11.0 (TID 149) in 989 ms on vm-14223739 (executor 2) (5/8) 2024-06-12 07:05:50,409 INFO TaskSetManager [task-result-getter-1]: Finished task 1.0 in stage 11.0 (TID 145) in 991 ms on vm-14223739 (executor 2) (6/8) 2024-06-12 07:05:50,428 INFO TaskSetManager [task-result-getter-0]: Finished task 3.0 in stage 11.0 (TID 147) in 1010 ms on vm-14223739 (executor 2) (7/8) 2024-06-12 07:05:50,438 INFO 
TaskSetManager [task-result-getter-3]: Finished task 7.0 in stage 11.0 (TID 151) in 1018 ms on vm-14223739 (executor 2) (8/8) 2024-06-12 07:05:50,438 INFO YarnClusterScheduler [task-result-getter-3]: Removed TaskSet 11.0, whose tasks have all completed, from pool 2024-06-12 07:05:50,438 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 11 (count at NativeMethodAccessorImpl.java:0) finished in 1.044 s 2024-06-12 07:05:50,439 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:05:50,439 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set() 2024-06-12 07:05:50,439 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:05:50,439 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:05:50,448 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:50,448 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:50,448 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:05:50,448 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:50,448 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:05:50,449 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:50,449 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:05:50,449 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:50,454 INFO 
ShufflePartitionsUtil [Thread-47]: For shuffle(3, 2), advisory target size: 67108864, actual target size 1048576, minimum partition size: 1048576 2024-06-12 07:05:50,517 INFO CodeGenerator [Thread-47]: Code generated in 29.770526 ms 2024-06-12 07:05:50,763 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_13_piece0 on vm-58f13156:40101 in memory (size: 16.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,764 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_13_piece0 on vm-58f13156:42761 in memory (size: 16.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:50,764 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_13_piece0 on vm-14223739:44757 in memory (size: 16.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,775 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_12_piece0 on vm-58f13156:42761 in memory (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:50,775 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_12_piece0 on vm-58f13156:40101 in memory (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,778 INFO CodeGenerator [Thread-47]: Code generated in 25.421393 ms 2024-06-12 07:05:50,786 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 
2024-06-12 07:05:50,787 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_10_piece0 on vm-58f13156:42761 in memory (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:50,787 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_10_piece0 on vm-58f13156:40101 in memory (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,788 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_10_piece0 on vm-14223739:44757 in memory (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,795 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_8_piece0 on vm-58f13156:42761 in memory (size: 7.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:50,795 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_8_piece0 on vm-58f13156:40101 in memory (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,796 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_8_piece0 on vm-14223739:44757 in memory (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,809 INFO CodeGenerator [Thread-47]: Code generated in 20.149952 ms 2024-06-12 07:05:50,826 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 36 (count at NativeMethodAccessorImpl.java:0) as input to shuffle 4 2024-06-12 07:05:50,827 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 11 (count at NativeMethodAccessorImpl.java:0) with 7 output partitions 2024-06-12 07:05:50,827 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 15 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:50,827 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 13, ShuffleMapStage 14) 2024-06-12 07:05:50,827 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:50,828 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 15 (MapPartitionsRDD[36] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 
07:05:50,842 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_14 stored as values in memory (estimated size 62.7 KiB, free 3.0 GiB) 2024-06-12 07:05:50,843 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_14_piece0 stored as bytes in memory (estimated size 24.5 KiB, free 3.0 GiB) 2024-06-12 07:05:50,844 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_14_piece0 in memory on vm-58f13156:42761 (size: 24.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:50,844 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 14 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:50,845 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 7 missing tasks from ShuffleMapStage 15 (MapPartitionsRDD[36] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6)) 2024-06-12 07:05:50,845 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 15.0 with 7 tasks resource profile 0 2024-06-12 07:05:50,848 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 15.0 (TID 152) (vm-14223739, executor 2, partition 0, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,849 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 15.0 (TID 153) (vm-58f13156, executor 1, partition 1, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,849 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 15.0 (TID 154) (vm-14223739, executor 2, partition 2, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,849 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 15.0 (TID 155) (vm-58f13156, executor 1, partition 3, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,850 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 15.0 (TID 156) (vm-14223739, executor 
2, partition 4, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,850 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 15.0 (TID 157) (vm-58f13156, executor 1, partition 5, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,850 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 15.0 (TID 158) (vm-14223739, executor 2, partition 6, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:05:50,863 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_14_piece0 in memory on vm-58f13156:40101 (size: 24.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,869 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_14_piece0 in memory on vm-14223739:44757 (size: 24.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:50,887 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 3 to 10.0.32.6:33208 2024-06-12 07:05:50,943 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 3 to 10.0.32.4:57308 2024-06-12 07:05:50,972 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 2 to 10.0.32.6:33208 2024-06-12 07:05:51,073 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 2 to 10.0.32.4:57308 2024-06-12 07:05:51,488 INFO TaskSetManager [task-result-getter-2]: Finished task 5.0 in stage 15.0 (TID 157) in 638 ms on vm-58f13156 (executor 1) (1/7) 2024-06-12 07:05:51,518 INFO TaskSetManager [task-result-getter-1]: Finished task 1.0 in stage 15.0 (TID 153) in 669 ms on vm-58f13156 (executor 1) (2/7) 2024-06-12 07:05:51,524 INFO TaskSetManager [task-result-getter-0]: Finished task 3.0 in stage 15.0 (TID 155) in 674 ms on vm-58f13156 (executor 1) (3/7) 2024-06-12 07:05:51,768 INFO TaskSetManager [task-result-getter-3]: Finished 
task 4.0 in stage 15.0 (TID 156) in 918 ms on vm-14223739 (executor 2) (4/7) 2024-06-12 07:05:51,782 INFO TaskSetManager [task-result-getter-2]: Finished task 2.0 in stage 15.0 (TID 154) in 933 ms on vm-14223739 (executor 2) (5/7) 2024-06-12 07:05:51,795 INFO TaskSetManager [task-result-getter-1]: Finished task 0.0 in stage 15.0 (TID 152) in 948 ms on vm-14223739 (executor 2) (6/7) 2024-06-12 07:05:51,836 INFO TaskSetManager [task-result-getter-0]: Finished task 6.0 in stage 15.0 (TID 158) in 986 ms on vm-14223739 (executor 2) (7/7) 2024-06-12 07:05:51,836 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 15.0, whose tasks have all completed, from pool 2024-06-12 07:05:51,837 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 15 (count at NativeMethodAccessorImpl.java:0) finished in 1.002 s 2024-06-12 07:05:51,837 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:05:51,837 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set() 2024-06-12 07:05:51,837 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:05:51,837 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:05:51,860 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(4), advisory target size: 67108864, actual target size 1048576, minimum partition size: 1048576 2024-06-12 07:05:51,894 INFO SparkContext [broadcast-exchange-0]: Starting job: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266 2024-06-12 07:05:51,896 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 12 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) with 6 output partitions 2024-06-12 07:05:51,896 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 20 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) 2024-06-12 07:05:51,896 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 19) 2024-06-12 07:05:51,896 INFO DAGScheduler 
[dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:51,897 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 20 (MapPartitionsRDD[38] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266), which has no missing parents 2024-06-12 07:05:51,900 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_15 stored as values in memory (estimated size 7.4 KiB, free 3.0 GiB) 2024-06-12 07:05:51,902 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_15_piece0 stored as bytes in memory (estimated size 4.0 KiB, free 3.0 GiB) 2024-06-12 07:05:51,902 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_15_piece0 in memory on vm-58f13156:42761 (size: 4.0 KiB, free: 3.0 GiB) 2024-06-12 07:05:51,902 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 15 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:51,903 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 6 missing tasks from ResultStage 20 (MapPartitionsRDD[38] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5)) 2024-06-12 07:05:51,903 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 20.0 with 6 tasks resource profile 0 2024-06-12 07:05:51,904 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 20.0 (TID 159) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:05:51,904 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 20.0 (TID 160) (vm-14223739, executor 2, partition 1, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:05:51,905 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 20.0 (TID 161) (vm-58f13156, executor 1, partition 2, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:05:51,905 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: 
Starting task 3.0 in stage 20.0 (TID 162) (vm-14223739, executor 2, partition 3, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:05:51,905 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 20.0 (TID 163) (vm-58f13156, executor 1, partition 4, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:05:51,905 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 20.0 (TID 164) (vm-14223739, executor 2, partition 5, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:05:51,916 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_15_piece0 in memory on vm-58f13156:40101 (size: 4.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:51,929 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 4 to 10.0.32.6:33208 2024-06-12 07:05:51,935 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_15_piece0 in memory on vm-14223739:44757 (size: 4.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:51,948 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 4 to 10.0.32.4:57308 2024-06-12 07:05:52,040 INFO TaskSetManager [task-result-getter-3]: Finished task 3.0 in stage 20.0 (TID 162) in 135 ms on vm-14223739 (executor 2) (1/6) 2024-06-12 07:05:52,045 INFO TaskSetManager [task-result-getter-2]: Finished task 1.0 in stage 20.0 (TID 160) in 141 ms on vm-14223739 (executor 2) (2/6) 2024-06-12 07:05:52,046 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added taskresult_164 in memory on vm-14223739:44757 (size: 1321.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,057 INFO TaskSetManager [task-result-getter-0]: Finished task 2.0 in stage 20.0 (TID 161) in 153 ms on vm-58f13156 (executor 1) (3/6) 2024-06-12 07:05:52,058 INFO TaskSetManager [task-result-getter-3]: Finished task 4.0 in stage 20.0 (TID 163) in 153 ms on vm-58f13156 (executor 1) (4/6) 
2024-06-12 07:05:52,071 INFO TransportClientFactory [task-result-getter-1]: Successfully created connection to vm-14223739/10.0.32.4:44757 after 4 ms (0 ms spent in bootstraps) 2024-06-12 07:05:52,087 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added taskresult_159 in memory on vm-58f13156:40101 (size: 1877.2 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,096 INFO TransportClientFactory [task-result-getter-2]: Successfully created connection to vm-58f13156/10.0.32.6:40101 after 2 ms (0 ms spent in bootstraps) 2024-06-12 07:05:52,106 INFO TaskSetManager [task-result-getter-1]: Finished task 5.0 in stage 20.0 (TID 164) in 201 ms on vm-14223739 (executor 2) (5/6) 2024-06-12 07:05:52,106 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 20.0 (TID 159) in 202 ms on vm-58f13156 (executor 1) (6/6) 2024-06-12 07:05:52,106 INFO YarnClusterScheduler [task-result-getter-2]: Removed TaskSet 20.0, whose tasks have all completed, from pool 2024-06-12 07:05:52,107 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 20 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) finished in 0.209 s 2024-06-12 07:05:52,107 INFO DAGScheduler [dag-scheduler-event-loop]: Job 12 is finished. 
Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:52,108 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 20: Stage finished 2024-06-12 07:05:52,108 INFO DAGScheduler [broadcast-exchange-0]: Job 12 finished: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266, took 0.213615 s 2024-06-12 07:05:52,111 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed taskresult_164 on vm-14223739:44757 in memory (size: 1321.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,111 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed taskresult_159 on vm-58f13156:40101 in memory (size: 1877.2 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,142 INFO CodeGenerator [broadcast-exchange-0]: Code generated in 11.83509 ms 2024-06-12 07:05:52,702 INFO MemoryStore [broadcast-exchange-0]: Block broadcast_16_piece0 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:05:52,703 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece0 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:05:52,705 INFO MemoryStore [broadcast-exchange-0]: Block broadcast_16_piece1 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:05:52,706 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece1 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:05:52,708 INFO MemoryStore [broadcast-exchange-0]: Block broadcast_16_piece2 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:05:52,708 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece2 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:05:52,710 INFO MemoryStore [broadcast-exchange-0]: Block broadcast_16_piece3 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:05:52,711 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added 
broadcast_16_piece3 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:05:52,712 INFO MemoryStore [broadcast-exchange-0]: Block broadcast_16_piece4 stored as bytes in memory (estimated size 632.8 KiB, free 3.0 GiB) 2024-06-12 07:05:52,712 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece4 in memory on vm-58f13156:42761 (size: 632.8 KiB, free: 3.0 GiB) 2024-06-12 07:05:52,712 INFO SparkContext [broadcast-exchange-0]: Created broadcast 16 from $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266 2024-06-12 07:05:52,720 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(0), advisory target size: 67108864, actual target size 2485307, minimum partition size: 1048576 2024-06-12 07:05:52,766 INFO CodeGenerator [Thread-47]: Code generated in 10.792182 ms 2024-06-12 07:05:52,774 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 41 (count at NativeMethodAccessorImpl.java:0) as input to shuffle 5 2024-06-12 07:05:52,774 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 13 (count at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:05:52,775 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 22 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:52,775 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 21) 2024-06-12 07:05:52,775 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:52,775 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 22 (MapPartitionsRDD[41] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:52,781 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_17 stored as values in memory (estimated size 14.2 KiB, free 3.0 GiB) 2024-06-12 07:05:52,782 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_17_piece0 stored as bytes in memory (estimated size 7.1 KiB, free 3.0 
GiB) 2024-06-12 07:05:52,782 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_17_piece0 in memory on vm-58f13156:42761 (size: 7.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:52,783 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 17 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:52,783 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 22 (MapPartitionsRDD[41] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:05:52,783 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 22.0 with 8 tasks resource profile 0 2024-06-12 07:05:52,784 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 22.0 (TID 165) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,785 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 22.0 (TID 166) (vm-14223739, executor 2, partition 1, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,785 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 22.0 (TID 167) (vm-58f13156, executor 1, partition 2, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,785 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 22.0 (TID 168) (vm-14223739, executor 2, partition 3, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,785 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 22.0 (TID 169) (vm-58f13156, executor 1, partition 4, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,786 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 22.0 (TID 170) (vm-14223739, executor 2, partition 5, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 
07:05:52,786 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 22.0 (TID 171) (vm-58f13156, executor 1, partition 6, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,786 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 22.0 (TID 172) (vm-14223739, executor 2, partition 7, NODE_LOCAL, 7828 bytes) taskResourceAssignments Map() 2024-06-12 07:05:52,798 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_17_piece0 in memory on vm-58f13156:40101 (size: 7.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,802 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_17_piece0 in memory on vm-14223739:44757 (size: 7.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,806 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 0 to 10.0.32.6:33208 2024-06-12 07:05:52,810 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 0 to 10.0.32.4:57308 2024-06-12 07:05:52,838 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece3 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,850 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece2 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,856 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece4 in memory on vm-58f13156:40101 (size: 632.8 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,866 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece0 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,876 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece1 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,884 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added 
broadcast_16_piece1 in memory on vm-14223739:44757 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,916 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece3 in memory on vm-14223739:44757 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,927 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece4 in memory on vm-14223739:44757 (size: 632.8 KiB, free: 2.2 GiB) 2024-06-12 07:05:52,955 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece0 in memory on vm-14223739:44757 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:52,969 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 22.0 (TID 165) in 185 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:05:52,969 INFO TaskSetManager [task-result-getter-3]: Finished task 4.0 in stage 22.0 (TID 169) in 184 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:05:52,983 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_16_piece2 in memory on vm-14223739:44757 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:05:53,092 INFO TaskSetManager [task-result-getter-1]: Finished task 7.0 in stage 22.0 (TID 172) in 306 ms on vm-14223739 (executor 2) (3/8) 2024-06-12 07:05:53,092 INFO TaskSetManager [task-result-getter-2]: Finished task 1.0 in stage 22.0 (TID 166) in 307 ms on vm-14223739 (executor 2) (4/8) 2024-06-12 07:05:53,093 INFO TaskSetManager [task-result-getter-0]: Finished task 3.0 in stage 22.0 (TID 168) in 308 ms on vm-14223739 (executor 2) (5/8) 2024-06-12 07:05:53,093 INFO TaskSetManager [task-result-getter-3]: Finished task 5.0 in stage 22.0 (TID 170) in 307 ms on vm-14223739 (executor 2) (6/8) 2024-06-12 07:05:53,227 INFO TaskSetManager [task-result-getter-1]: Finished task 6.0 in stage 22.0 (TID 171) in 441 ms on vm-58f13156 (executor 1) (7/8) 2024-06-12 07:05:53,444 INFO TaskSetManager [task-result-getter-2]: Finished task 2.0 in stage 22.0 (TID 167) in 659 ms on vm-58f13156 (executor 1) (8/8) 2024-06-12 
07:05:53,444 INFO YarnClusterScheduler [task-result-getter-2]: Removed TaskSet 22.0, whose tasks have all completed, from pool 2024-06-12 07:05:53,445 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 22 (count at NativeMethodAccessorImpl.java:0) finished in 0.668 s 2024-06-12 07:05:53,445 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:05:53,445 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set() 2024-06-12 07:05:53,445 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:05:53,445 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:05:53,449 INFO CreateAdviseEventHandler [spark-listener-group-shared]: Sending DataSkew to Advise Hub: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _stageId -> -1, _jobId -> 13, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@6580b764, _level -> warn, _stageAttemptId -> -1, _name -> Data skew for job 13) 2024-06-12 07:05:53,449 INFO KustoHandler [spark-listener-group-shared]: Logging DataSkew with appId: application_1718175835080_0001 to Kusto: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _stageId -> -1, _jobId -> 13, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@6580b764, _level -> warn, _stageAttemptId -> -1, _name -> Data skew for job 13) 2024-06-12 07:05:53,473 INFO CodeGenerator [Thread-47]: Code generated in 8.367964 ms 2024-06-12 07:05:53,485 INFO SparkContext [Thread-47]: Starting job: count at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:53,486 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 14 (count at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:53,486 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 25 (count at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:53,486 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 24) 
2024-06-12 07:05:53,486 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:53,487 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 25 (MapPartitionsRDD[44] at count at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:53,489 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_18 stored as values in memory (estimated size 11.5 KiB, free 3.0 GiB) 2024-06-12 07:05:53,490 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_18_piece0 stored as bytes in memory (estimated size 5.6 KiB, free 3.0 GiB) 2024-06-12 07:05:53,490 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_18_piece0 in memory on vm-58f13156:42761 (size: 5.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:53,491 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 18 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:53,491 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 25 (MapPartitionsRDD[44] at count at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:53,491 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 25.0 with 1 tasks resource profile 0 2024-06-12 07:05:53,493 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 25.0 (TID 173) (vm-14223739, executor 2, partition 0, NODE_LOCAL, 7820 bytes) taskResourceAssignments Map() 2024-06-12 07:05:53,508 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_18_piece0 in memory on vm-14223739:44757 (size: 5.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:53,512 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 5 to 10.0.32.4:57308 2024-06-12 07:05:53,535 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 25.0 (TID 173) in 43 ms on vm-14223739 (executor 2) (1/1) 2024-06-12 07:05:53,535 INFO 
YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 25.0, whose tasks have all completed, from pool 2024-06-12 07:05:53,536 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 25 (count at NativeMethodAccessorImpl.java:0) finished in 0.048 s 2024-06-12 07:05:53,536 INFO DAGScheduler [dag-scheduler-event-loop]: Job 14 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:05:53,536 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 25: Stage finished 2024-06-12 07:05:53,536 INFO DAGScheduler [Thread-47]: Job 14 finished: count at NativeMethodAccessorImpl.java:0, took 0.051344 s 2024-06-12 07:05:53,551 INFO CatalogTablePartitionCache [Thread-47]: Maximum partition cache size is is 30 MB 2024-06-12 07:05:53,554 INFO SharedCatalogPartitionInMemoryCache [Thread-47]: Max cache size is 32212254 2024-06-12 07:05:57,671 INFO SparkContext [Thread-47]: Added file /tmp/tmpvk6bf_1j.zip at spark://vm-58f13156:42075/files/tmpvk6bf_1j.zip with timestamp 1718175957670 2024-06-12 07:05:57,671 INFO Utils [Thread-47]: Copying /tmp/tmpvk6bf_1j.zip to /mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/spark-b62d55fb-1409-4495-86aa-9ee72645b5c7/userFiles-c0bc9305-f04e-4ef9-8f39-b0593e405d3f/tmpvk6bf_1j.zip 2024-06-12 07:05:58,407 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_17_piece0 on vm-58f13156:42761 in memory (size: 7.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:58,409 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_17_piece0 on vm-58f13156:40101 in memory (size: 7.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,410 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_17_piece0 on vm-14223739:44757 in memory (size: 7.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,415 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_18_piece0 on vm-58f13156:42761 in memory (size: 5.6 
KiB, free: 3.0 GiB) 2024-06-12 07:05:58,416 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_18_piece0 on vm-14223739:44757 in memory (size: 5.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,421 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_15_piece0 on vm-58f13156:42761 in memory (size: 4.0 KiB, free: 3.0 GiB) 2024-06-12 07:05:58,422 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_15_piece0 on vm-58f13156:40101 in memory (size: 4.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,422 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_15_piece0 on vm-14223739:44757 in memory (size: 4.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,429 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_14_piece0 on vm-58f13156:42761 in memory (size: 24.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:58,430 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_14_piece0 on vm-58f13156:40101 in memory (size: 24.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,431 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_14_piece0 on vm-14223739:44757 in memory (size: 24.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:58,864 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:58,864 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,865 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,865 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,865 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,865 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,865 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,881 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:58,881 INFO 
VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,881 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,881 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,881 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,881 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,881 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,906 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,924 INFO VegasOptimizerRule$ [Thread-47]: Cache size Some(0) 2024-06-12 07:05:58,924 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,924 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,924 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,924 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,924 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:58,925 INFO VegasOptimizerRule$ [Thread-47]: Vegas cache size 0 disables the cache 2024-06-12 07:05:59,016 INFO BaseAllocator [Thread-47]: Debug mode 
disabled. 2024-06-12 07:05:59,020 INFO DefaultAllocationManagerOption [Thread-47]: allocation manager type not specified, using netty as the default type 2024-06-12 07:05:59,021 INFO CheckAllocator [Thread-47]: Using DefaultAllocationManager at memory-netty-7.0.0.jar!/org/apache/arrow/memory/DefaultAllocationManagerFactory.class 2024-06-12 07:05:59,058 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(CustomerID) 2024-06-12 07:05:59,058 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(CustomerID#724L) 2024-06-12 07:05:59,060 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(OrderDate),GreaterThanOrEqual(OrderDate,2024-03-28) 2024-06-12 07:05:59,061 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(OrderDate#734),(OrderDate#734 >= 2024-03-28) 2024-06-12 07:05:59,062 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(Created),GreaterThanOrEqual(Created,2024-05-12 00:00:00.0) 2024-06-12 07:05:59,062 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(Created#762),(Created#762 >= 2024-05-12 00:00:00) 2024-06-12 07:05:59,065 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(CustomerID) 2024-06-12 07:05:59,065 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(CustomerID#33L) 2024-06-12 07:05:59,067 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(CustomerID) 2024-06-12 07:05:59,067 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(CustomerID#537L) 2024-06-12 07:05:59,076 INFO FileSourceScanPlan [Thread-47]: Pushed Filters: IsNotNull(Status),IsNotNull(PartnerRoleID),EqualTo(Status,true),Not(EqualTo(PartnerRoleID,88)) 2024-06-12 07:05:59,076 INFO FileSourceScanPlan [Thread-47]: Post-Scan Filters: isnotnull(Status#10),isnotnull(PartnerRoleID#11L),Status#10,NOT (PartnerRoleID#11L = 88) 2024-06-12 07:05:59,088 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 
2024-06-12 07:05:59,088 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:05:59,088 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:59,088 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:59,089 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:59,089 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:59,089 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:59,089 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:59,089 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:05:59,089 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:05:59,090 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:05:59,090 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:05:59,114 WARN package [Thread-47]: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. 
2024-06-12 07:05:59,194 INFO MemoryStore [Thread-47]: Block broadcast_19 stored as values in memory (estimated size 597.5 KiB, free 3.0 GiB) 2024-06-12 07:05:59,214 INFO MemoryStore [Thread-47]: Block broadcast_19_piece0 stored as bytes in memory (estimated size 55.0 KiB, free 3.0 GiB) 2024-06-12 07:05:59,214 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_19_piece0 in memory on vm-58f13156:42761 (size: 55.0 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,215 INFO SparkContext [Thread-47]: Created broadcast 19 from showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:59,216 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 25285940 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:59,224 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 48 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 6 2024-06-12 07:05:59,225 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 
2024-06-12 07:05:59,225 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 15 (showString at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:05:59,225 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 26 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:59,225 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:59,225 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:59,225 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 26 (MapPartitionsRDD[48] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:59,230 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_20 stored as values in memory (estimated size 16.0 KiB, free 3.0 GiB) 2024-06-12 07:05:59,232 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_20_piece0 stored as bytes in memory (estimated size 7.5 KiB, free 3.0 GiB) 2024-06-12 07:05:59,232 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_20_piece0 in memory on vm-58f13156:42761 (size: 7.5 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,233 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 20 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:59,233 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 26 (MapPartitionsRDD[48] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:05:59,234 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 26.0 with 8 tasks resource profile 0 2024-06-12 07:05:59,235 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 26.0 (TID 174) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,235 INFO TaskSetManager 
[dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 26.0 (TID 175) (vm-14223739, executor 2, partition 1, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,235 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 26.0 (TID 176) (vm-58f13156, executor 1, partition 2, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,235 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 26.0 (TID 177) (vm-14223739, executor 2, partition 3, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,236 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 26.0 (TID 178) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,236 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 26.0 (TID 179) (vm-14223739, executor 2, partition 5, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,236 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 26.0 (TID 180) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,236 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 26.0 (TID 181) (vm-14223739, executor 2, partition 7, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,239 INFO MemoryStore [Thread-47]: Block broadcast_21 stored as values in memory (estimated size 597.6 KiB, free 3.0 GiB) 2024-06-12 07:05:59,271 INFO MemoryStore [Thread-47]: Block broadcast_21_piece0 stored as bytes in memory (estimated size 55.1 KiB, free 3.0 GiB) 2024-06-12 07:05:59,272 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_21_piece0 in memory on vm-58f13156:42761 (size: 55.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,273 INFO SparkContext [Thread-47]: 
Created broadcast 21 from showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:59,278 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 134217728 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:59,295 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 52 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 7 2024-06-12 07:05:59,295 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 2024-06-12 07:05:59,295 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 16 (showString at NativeMethodAccessorImpl.java:0) with 17 output partitions 2024-06-12 07:05:59,295 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 27 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:59,295 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:59,295 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:59,295 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 27 (MapPartitionsRDD[52] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:59,300 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_22 stored as values in memory (estimated size 33.3 KiB, free 3.0 GiB) 2024-06-12 07:05:59,302 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_22_piece0 stored as bytes in memory (estimated size 14.6 KiB, free 3.0 GiB) 2024-06-12 07:05:59,302 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_22_piece0 in memory on vm-58f13156:42761 (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,302 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 22 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:59,303 INFO DAGScheduler 
[dag-scheduler-event-loop]: Submitting 17 missing tasks from ShuffleMapStage 27 (MapPartitionsRDD[52] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 2024-06-12 07:05:59,303 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 27.0 with 17 tasks resource profile 0 2024-06-12 07:05:59,304 INFO MemoryStore [Thread-47]: Block broadcast_23 stored as values in memory (estimated size 597.6 KiB, free 3.0 GiB) 2024-06-12 07:05:59,316 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_20_piece0 in memory on vm-58f13156:40101 (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,324 INFO MemoryStore [Thread-47]: Block broadcast_23_piece0 stored as bytes in memory (estimated size 55.1 KiB, free 3.0 GiB) 2024-06-12 07:05:59,325 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_23_piece0 in memory on vm-58f13156:42761 (size: 55.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,325 INFO SparkContext [Thread-47]: Created broadcast 23 from showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:59,326 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 25285940 bytes, open cost is considered as scanning 4194304 bytes. 
2024-06-12 07:05:59,328 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_19_piece0 in memory on vm-58f13156:40101 (size: 55.0 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,334 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 56 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 8 2024-06-12 07:05:59,334 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 17 (showString at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:05:59,334 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 28 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:59,334 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:59,335 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:59,336 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 28 (MapPartitionsRDD[56] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:59,340 INFO MemoryStore [Thread-47]: Block broadcast_24 stored as values in memory (estimated size 630.2 KiB, free 3.0 GiB) 2024-06-12 07:05:59,340 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_25 stored as values in memory (estimated size 33.3 KiB, free 3.0 GiB) 2024-06-12 07:05:59,342 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_25_piece0 stored as bytes in memory (estimated size 14.6 KiB, free 3.0 GiB) 2024-06-12 07:05:59,342 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_25_piece0 in memory on vm-58f13156:42761 (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,342 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 25 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:59,343 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 28 (MapPartitionsRDD[56] at showString at NativeMethodAccessorImpl.java:0) (first 15 
tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:05:59,343 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 28.0 with 8 tasks resource profile 0 2024-06-12 07:05:59,366 INFO MemoryStore [Thread-47]: Block broadcast_24_piece0 stored as bytes in memory (estimated size 60.0 KiB, free 3.0 GiB) 2024-06-12 07:05:59,366 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_24_piece0 in memory on vm-58f13156:42761 (size: 60.0 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,367 INFO SparkContext [Thread-47]: Created broadcast 24 from showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:59,371 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_20_piece0 in memory on vm-14223739:44757 (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,372 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 77510446 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:59,381 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 60 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 9 2024-06-12 07:05:59,381 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 18 (showString at NativeMethodAccessorImpl.java:0) with 10 output partitions 2024-06-12 07:05:59,382 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 29 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:59,382 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:59,382 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:59,383 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 29 (MapPartitionsRDD[60] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:59,389 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_19_piece0 in memory on vm-14223739:44757 (size: 55.0 KiB, free: 2.2 
GiB) 2024-06-12 07:05:59,410 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_26 stored as values in memory (estimated size 63.6 KiB, free 3.0 GiB) 2024-06-12 07:05:59,413 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_26_piece0 stored as bytes in memory (estimated size 20.1 KiB, free 3.0 GiB) 2024-06-12 07:05:59,413 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_26_piece0 in memory on vm-58f13156:42761 (size: 20.1 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,413 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 26 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:59,414 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 10 missing tasks from ShuffleMapStage 29 (MapPartitionsRDD[60] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)) 2024-06-12 07:05:59,414 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 29.0 with 10 tasks resource profile 0 2024-06-12 07:05:59,442 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 27.0 (TID 182) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,443 INFO TaskSetManager [task-result-getter-3]: Finished task 0.0 in stage 26.0 (TID 174) in 208 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:05:59,458 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_22_piece0 in memory on vm-58f13156:40101 (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,461 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 27.0 (TID 183) (vm-58f13156, executor 1, partition 1, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,461 INFO TaskSetManager [task-result-getter-1]: Finished task 4.0 in stage 26.0 (TID 178) in 225 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:05:59,476 INFO BlockManagerInfo 
[dispatcher-BlockManagerMaster]: Added broadcast_21_piece0 in memory on vm-58f13156:40101 (size: 55.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,485 INFO CodeGenerator [Thread-47]: Code generated in 61.175472 ms 2024-06-12 07:05:59,489 INFO MemoryStore [Thread-47]: Block broadcast_27 stored as values in memory (estimated size 607.2 KiB, free 3.0 GiB) 2024-06-12 07:05:59,496 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 27.0 (TID 184) (vm-14223739, executor 2, partition 2, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,496 INFO TaskSetManager [task-result-getter-2]: Finished task 1.0 in stage 26.0 (TID 175) in 261 ms on vm-14223739 (executor 2) (3/8) 2024-06-12 07:05:59,497 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 27.0 (TID 185) (vm-14223739, executor 2, partition 3, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,498 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 27.0 (TID 186) (vm-14223739, executor 2, partition 4, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,500 INFO TaskSetManager [task-result-getter-0]: Finished task 7.0 in stage 26.0 (TID 181) in 264 ms on vm-14223739 (executor 2) (4/8) 2024-06-12 07:05:59,507 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 27.0 (TID 187) (vm-14223739, executor 2, partition 5, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,508 INFO TaskSetManager [task-result-getter-1]: Finished task 5.0 in stage 26.0 (TID 179) in 272 ms on vm-14223739 (executor 2) (5/8) 2024-06-12 07:05:59,508 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_22_piece0 in memory on vm-14223739:44757 (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,513 INFO MemoryStore [Thread-47]: Block broadcast_27_piece0 stored as bytes in memory (estimated size 56.9 KiB, free 3.0 
GiB) 2024-06-12 07:05:59,513 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_27_piece0 in memory on vm-58f13156:42761 (size: 56.9 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,514 INFO SparkContext [Thread-47]: Created broadcast 27 from showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:59,518 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 24367437 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:59,519 INFO TaskSetManager [task-result-getter-3]: Finished task 3.0 in stage 26.0 (TID 177) in 284 ms on vm-14223739 (executor 2) (6/8) 2024-06-12 07:05:59,531 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 64 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 10 2024-06-12 07:05:59,531 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 2024-06-12 07:05:59,531 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 19 (showString at NativeMethodAccessorImpl.java:0) with 10 output partitions 2024-06-12 07:05:59,531 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 30 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:59,531 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:59,532 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:59,532 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 30 (MapPartitionsRDD[64] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:59,536 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_28 stored as values in memory (estimated size 59.8 KiB, free 3.0 GiB) 2024-06-12 07:05:59,538 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_28_piece0 stored as bytes in memory (estimated size 14.0 
KiB, free 3.0 GiB) 2024-06-12 07:05:59,539 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_28_piece0 in memory on vm-58f13156:42761 (size: 14.0 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,539 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 28 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:59,541 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 10 missing tasks from ShuffleMapStage 30 (MapPartitionsRDD[64] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)) 2024-06-12 07:05:59,541 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 30.0 with 10 tasks resource profile 0 2024-06-12 07:05:59,542 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_21_piece0 in memory on vm-14223739:44757 (size: 55.1 KiB, free: 2.2 GiB) 2024-06-12 07:05:59,568 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 27.0 (TID 188) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,568 INFO CodeGenerator [Thread-47]: Code generated in 23.39478 ms 2024-06-12 07:05:59,569 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 27.0 (TID 182) in 127 ms on vm-58f13156 (executor 1) (1/17) 2024-06-12 07:05:59,570 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 27.0 (TID 189) (vm-58f13156, executor 1, partition 7, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,571 INFO TaskSetManager [task-result-getter-0]: Finished task 1.0 in stage 27.0 (TID 183) in 111 ms on vm-58f13156 (executor 1) (2/17) 2024-06-12 07:05:59,572 INFO MemoryStore [Thread-47]: Block broadcast_29 stored as values in memory (estimated size 597.9 KiB, free 3.0 GiB) 2024-06-12 07:05:59,595 INFO MemoryStore [Thread-47]: Block broadcast_29_piece0 stored as bytes in memory (estimated size 55.4 KiB, free 
3.0 GiB) 2024-06-12 07:05:59,596 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_29_piece0 in memory on vm-58f13156:42761 (size: 55.4 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,597 INFO SparkContext [Thread-47]: Created broadcast 29 from showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:05:59,597 INFO FileSourceScanExec [Thread-47]: Planning scan with bin packing, max size: 4194304 bytes, open cost is considered as scanning 4194304 bytes. 2024-06-12 07:05:59,609 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 68 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 11 2024-06-12 07:05:59,609 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 20 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:05:59,609 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 31 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:05:59,609 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List() 2024-06-12 07:05:59,610 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:05:59,610 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 31 (MapPartitionsRDD[68] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:05:59,613 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_30 stored as values in memory (estimated size 35.7 KiB, free 3.0 GiB) 2024-06-12 07:05:59,614 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_30_piece0 stored as bytes in memory (estimated size 15.2 KiB, free 3.0 GiB) 2024-06-12 07:05:59,614 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_30_piece0 in memory on vm-58f13156:42761 (size: 15.2 KiB, free: 3.0 GiB) 2024-06-12 07:05:59,615 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 30 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:05:59,615 INFO 
DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ShuffleMapStage 31 (MapPartitionsRDD[68] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:05:59,615 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 31.0 with 1 tasks resource profile 0 2024-06-12 07:05:59,631 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 8.0 in stage 27.0 (TID 190) (vm-58f13156, executor 1, partition 8, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,632 INFO TaskSetManager [task-result-getter-1]: Finished task 7.0 in stage 27.0 (TID 189) in 62 ms on vm-58f13156 (executor 1) (3/17) 2024-06-12 07:05:59,638 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 9.0 in stage 27.0 (TID 191) (vm-58f13156, executor 1, partition 9, PROCESS_LOCAL, 8354 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,639 INFO TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 27.0 (TID 188) in 71 ms on vm-58f13156 (executor 1) (4/17) 2024-06-12 07:05:59,645 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 10.0 in stage 27.0 (TID 192) (vm-14223739, executor 2, partition 10, PROCESS_LOCAL, 11084 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,645 INFO TaskSetManager [task-result-getter-2]: Finished task 4.0 in stage 27.0 (TID 186) in 147 ms on vm-14223739 (executor 2) (5/17) 2024-06-12 07:05:59,646 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 11.0 in stage 27.0 (TID 193) (vm-14223739, executor 2, partition 11, PROCESS_LOCAL, 11751 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,646 INFO TaskSetManager [task-result-getter-0]: Finished task 3.0 in stage 27.0 (TID 185) in 149 ms on vm-14223739 (executor 2) (6/17) 2024-06-12 07:05:59,647 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 12.0 in stage 27.0 (TID 194) (vm-14223739, executor 2, 
partition 12, PROCESS_LOCAL, 11751 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,647 INFO TaskSetManager [task-result-getter-1]: Finished task 5.0 in stage 27.0 (TID 187) in 140 ms on vm-14223739 (executor 2) (7/17) 2024-06-12 07:05:59,648 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 13.0 in stage 27.0 (TID 195) (vm-14223739, executor 2, partition 13, PROCESS_LOCAL, 11961 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,653 INFO TaskSetManager [task-result-getter-3]: Finished task 2.0 in stage 27.0 (TID 184) in 152 ms on vm-14223739 (executor 2) (8/17) 2024-06-12 07:05:59,696 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 14.0 in stage 27.0 (TID 196) (vm-58f13156, executor 1, partition 14, PROCESS_LOCAL, 11961 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,697 INFO TaskSetManager [task-result-getter-2]: Finished task 8.0 in stage 27.0 (TID 190) in 66 ms on vm-58f13156 (executor 1) (9/17) 2024-06-12 07:05:59,701 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 15.0 in stage 27.0 (TID 197) (vm-58f13156, executor 1, partition 15, PROCESS_LOCAL, 12381 bytes) taskResourceAssignments Map() 2024-06-12 07:05:59,701 INFO TaskSetManager [task-result-getter-0]: Finished task 9.0 in stage 27.0 (TID 191) in 63 ms on vm-58f13156 (executor 1) (10/17) 2024-06-12 07:06:00,069 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 16.0 in stage 27.0 (TID 198) (vm-58f13156, executor 1, partition 16, PROCESS_LOCAL, 8391 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,069 INFO TaskSetManager [task-result-getter-1]: Finished task 6.0 in stage 26.0 (TID 180) in 833 ms on vm-58f13156 (executor 1) (7/8) 2024-06-12 07:06:00,138 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 28.0 (TID 199) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,139 INFO TaskSetManager 
[task-result-getter-3]: Finished task 16.0 in stage 27.0 (TID 198) in 70 ms on vm-58f13156 (executor 1) (11/17) 2024-06-12 07:06:00,146 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_25_piece0 in memory on vm-58f13156:40101 (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:00,157 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_23_piece0 in memory on vm-58f13156:40101 (size: 55.1 KiB, free: 2.2 GiB) 2024-06-12 07:06:00,215 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 28.0 (TID 200) (vm-58f13156, executor 1, partition 1, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,215 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 28.0 (TID 199) in 77 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:06:00,278 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 28.0 (TID 201) (vm-58f13156, executor 1, partition 2, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,278 INFO TaskSetManager [task-result-getter-0]: Finished task 1.0 in stage 28.0 (TID 200) in 63 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:06:00,595 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 28.0 (TID 202) (vm-58f13156, executor 1, partition 3, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,595 INFO TaskSetManager [task-result-getter-1]: Finished task 2.0 in stage 26.0 (TID 176) in 1360 ms on vm-58f13156 (executor 1) (8/8) 2024-06-12 07:06:00,596 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 26.0, whose tasks have all completed, from pool 2024-06-12 07:06:00,596 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 26 (showString at NativeMethodAccessorImpl.java:0) finished in 1.370 s 2024-06-12 07:06:00,596 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:00,596 INFO 
DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 30, ShuffleMapStage 27, ShuffleMapStage 31, ShuffleMapStage 28, ShuffleMapStage 29) 2024-06-12 07:06:00,596 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:00,596 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:00,618 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,618 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,618 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for Vector(CustomerID#737) does not exist 2024-06-12 07:06:00,618 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use Vector(CustomerID#737), returning default shuffle keys 2024-06-12 07:06:00,618 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for Vector(CustomerID#760L) does not exist 2024-06-12 07:06:00,618 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use Vector(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to 
use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:00,619 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:00,620 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:00,620 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:00,620 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for Vector(MarketingName#8, CatalogCategoryName#1) does not exist 2024-06-12 07:06:00,620 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use Vector(MarketingName#8, CatalogCategoryName#1), returning default shuffle keys 2024-06-12 07:06:00,652 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 28.0 (TID 203) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 8325 bytes) 
taskResourceAssignments Map() 2024-06-12 07:06:00,652 INFO TaskSetManager [task-result-getter-3]: Finished task 3.0 in stage 28.0 (TID 202) in 57 ms on vm-58f13156 (executor 1) (3/8) 2024-06-12 07:06:00,704 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 28.0 (TID 204) (vm-58f13156, executor 1, partition 5, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,704 INFO TaskSetManager [task-result-getter-2]: Finished task 4.0 in stage 28.0 (TID 203) in 53 ms on vm-58f13156 (executor 1) (4/8) 2024-06-12 07:06:00,754 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 28.0 (TID 205) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:00,755 INFO TaskSetManager [task-result-getter-0]: Finished task 5.0 in stage 28.0 (TID 204) in 51 ms on vm-58f13156 (executor 1) (5/8) 2024-06-12 07:06:01,746 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 28.0 (TID 206) (vm-58f13156, executor 1, partition 7, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:01,746 INFO TaskSetManager [task-result-getter-1]: Finished task 6.0 in stage 28.0 (TID 205) in 992 ms on vm-58f13156 (executor 1) (6/8) 2024-06-12 07:06:01,772 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 29.0 (TID 207) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:01,773 INFO TaskSetManager [task-result-getter-3]: Finished task 2.0 in stage 28.0 (TID 201) in 1494 ms on vm-58f13156 (executor 1) (7/8) 2024-06-12 07:06:01,779 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_26_piece0 in memory on vm-58f13156:40101 (size: 20.1 KiB, free: 2.2 GiB) 2024-06-12 07:06:01,816 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 29.0 (TID 208) (vm-58f13156, executor 1, 
partition 1, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:01,817 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 28.0 (TID 206) in 72 ms on vm-58f13156 (executor 1) (8/8) 2024-06-12 07:06:01,817 INFO YarnClusterScheduler [task-result-getter-2]: Removed TaskSet 28.0, whose tasks have all completed, from pool 2024-06-12 07:06:01,817 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 28 (showString at NativeMethodAccessorImpl.java:0) finished in 2.480 s 2024-06-12 07:06:01,818 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:01,818 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 30, ShuffleMapStage 27, ShuffleMapStage 31, ShuffleMapStage 29) 2024-06-12 07:06:01,818 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:01,818 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:01,843 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,843 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,843 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:01,843 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:01,843 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,844 INFO 
EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,844 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,845 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:01,845 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:01,845 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:01,845 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:01,930 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_24_piece0 in memory on vm-58f13156:40101 (size: 60.0 KiB, free: 2.2 GiB) 2024-06-12 07:06:02,042 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 29.0 (TID 209) (vm-14223739, executor 2, partition 2, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:02,043 INFO TaskSetManager [task-result-getter-0]: 
Finished task 12.0 in stage 27.0 (TID 194) in 2396 ms on vm-14223739 (executor 2) (12/17) 2024-06-12 07:06:02,056 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_26_piece0 in memory on vm-14223739:44757 (size: 20.1 KiB, free: 2.2 GiB) 2024-06-12 07:06:02,162 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 29.0 (TID 210) (vm-58f13156, executor 1, partition 3, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:02,162 INFO TaskSetManager [task-result-getter-1]: Finished task 15.0 in stage 27.0 (TID 197) in 2462 ms on vm-58f13156 (executor 1) (13/17) 2024-06-12 07:06:02,218 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_24_piece0 in memory on vm-14223739:44757 (size: 60.0 KiB, free: 2.2 GiB) 2024-06-12 07:06:02,522 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 29.0 (TID 211) (vm-58f13156, executor 1, partition 4, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:02,523 INFO TaskSetManager [task-result-getter-3]: Finished task 14.0 in stage 27.0 (TID 196) in 2828 ms on vm-58f13156 (executor 1) (14/17) 2024-06-12 07:06:02,703 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 29.0 (TID 212) (vm-14223739, executor 2, partition 5, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:02,704 INFO TaskSetManager [task-result-getter-2]: Finished task 13.0 in stage 27.0 (TID 195) in 3056 ms on vm-14223739 (executor 2) (15/17) 2024-06-12 07:06:02,709 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 29.0 (TID 213) (vm-14223739, executor 2, partition 6, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:02,710 INFO TaskSetManager [task-result-getter-0]: Finished task 10.0 in stage 27.0 (TID 192) in 3066 ms on vm-14223739 (executor 2) (16/17) 2024-06-12 07:06:03,498 INFO TaskSetManager 
[dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 29.0 (TID 214) (vm-14223739, executor 2, partition 7, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:03,498 INFO TaskSetManager [task-result-getter-1]: Finished task 11.0 in stage 27.0 (TID 193) in 3852 ms on vm-14223739 (executor 2) (17/17) 2024-06-12 07:06:03,499 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 27.0, whose tasks have all completed, from pool 2024-06-12 07:06:03,499 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 27 (showString at NativeMethodAccessorImpl.java:0) finished in 4.202 s 2024-06-12 07:06:03,499 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:03,500 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 30, ShuffleMapStage 31, ShuffleMapStage 29) 2024-06-12 07:06:03,500 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:03,500 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:03,519 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,519 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,519 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:03,519 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:03,519 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,519 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: 
column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:03,520 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:03,521 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:03,521 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:03,528 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(7), advisory target size: 67108864, actual target size 2656857, minimum partition size: 1048576 2024-06-12 07:06:03,560 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 
2024-06-12 07:06:03,582 INFO CodeGenerator [Thread-47]: Code generated in 19.137247 ms 2024-06-12 07:06:03,591 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 71 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 12 2024-06-12 07:06:03,591 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 21 (showString at NativeMethodAccessorImpl.java:0) with 8 output partitions 2024-06-12 07:06:03,592 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 33 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:06:03,592 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 32) 2024-06-12 07:06:03,592 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:06:03,592 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 33 (MapPartitionsRDD[71] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:06:03,601 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_31 stored as values in memory (estimated size 36.4 KiB, free 3.0 GiB) 2024-06-12 07:06:03,602 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_31_piece0 stored as bytes in memory (estimated size 16.6 KiB, free 3.0 GiB) 2024-06-12 07:06:03,603 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_31_piece0 in memory on vm-58f13156:42761 (size: 16.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:03,604 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 31 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:06:03,604 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 8 missing tasks from ShuffleMapStage 33 (MapPartitionsRDD[71] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7)) 2024-06-12 07:06:03,604 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 33.0 with 8 tasks resource profile 0 2024-06-12 
07:06:08,512 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 8.0 in stage 29.0 (TID 215) (vm-58f13156, executor 1, partition 8, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:08,513 INFO TaskSetManager [task-result-getter-3]: Finished task 0.0 in stage 29.0 (TID 207) in 6742 ms on vm-58f13156 (executor 1) (1/10) 2024-06-12 07:06:08,513 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 9.0 in stage 29.0 (TID 216) (vm-58f13156, executor 1, partition 9, PROCESS_LOCAL, 8505 bytes) taskResourceAssignments Map() 2024-06-12 07:06:08,514 INFO TaskSetManager [task-result-getter-2]: Finished task 1.0 in stage 29.0 (TID 208) in 6697 ms on vm-58f13156 (executor 1) (2/10) 2024-06-12 07:06:08,516 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 30.0 (TID 217) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:08,516 INFO TaskSetManager [task-result-getter-0]: Finished task 3.0 in stage 29.0 (TID 210) in 6355 ms on vm-58f13156 (executor 1) (3/10) 2024-06-12 07:06:08,523 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_28_piece0 in memory on vm-58f13156:40101 (size: 14.0 KiB, free: 2.2 GiB) 2024-06-12 07:06:08,577 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_27_piece0 in memory on vm-58f13156:40101 (size: 56.9 KiB, free: 2.2 GiB) 2024-06-12 07:06:08,625 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 30.0 (TID 218) (vm-58f13156, executor 1, partition 1, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:08,626 INFO TaskSetManager [task-result-getter-1]: Finished task 4.0 in stage 29.0 (TID 211) in 6104 ms on vm-58f13156 (executor 1) (4/10) 2024-06-12 07:06:09,964 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 30.0 (TID 219) (vm-14223739, executor 2, partition 2, 
PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:09,965 INFO TaskSetManager [task-result-getter-3]: Finished task 2.0 in stage 29.0 (TID 209) in 7923 ms on vm-14223739 (executor 2) (5/10) 2024-06-12 07:06:09,981 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_28_piece0 in memory on vm-14223739:44757 (size: 14.0 KiB, free: 2.2 GiB) 2024-06-12 07:06:10,037 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 30.0 (TID 220) (vm-14223739, executor 2, partition 3, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:10,038 INFO TaskSetManager [task-result-getter-2]: Finished task 5.0 in stage 29.0 (TID 212) in 7335 ms on vm-14223739 (executor 2) (6/10) 2024-06-12 07:06:10,051 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_27_piece0 in memory on vm-14223739:44757 (size: 56.9 KiB, free: 2.2 GiB) 2024-06-12 07:06:10,140 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 30.0 (TID 221) (vm-14223739, executor 2, partition 4, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:10,141 INFO TaskSetManager [task-result-getter-0]: Finished task 6.0 in stage 29.0 (TID 213) in 7432 ms on vm-14223739 (executor 2) (7/10) 2024-06-12 07:06:10,941 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 30.0 (TID 222) (vm-14223739, executor 2, partition 5, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:10,942 INFO TaskSetManager [task-result-getter-1]: Finished task 7.0 in stage 29.0 (TID 214) in 7444 ms on vm-14223739 (executor 2) (8/10) 2024-06-12 07:06:12,560 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 30.0 (TID 223) (vm-58f13156, executor 1, partition 6, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:12,560 INFO TaskSetManager [task-result-getter-3]: Finished task 0.0 in stage 30.0 (TID 
217) in 4044 ms on vm-58f13156 (executor 1) (1/10) 2024-06-12 07:06:12,568 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 30.0 (TID 224) (vm-58f13156, executor 1, partition 7, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:12,568 INFO TaskSetManager [task-result-getter-2]: Finished task 1.0 in stage 30.0 (TID 218) in 3943 ms on vm-58f13156 (executor 1) (2/10) 2024-06-12 07:06:14,554 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 8.0 in stage 30.0 (TID 225) (vm-58f13156, executor 1, partition 8, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:14,555 INFO TaskSetManager [task-result-getter-0]: Finished task 9.0 in stage 29.0 (TID 216) in 6041 ms on vm-58f13156 (executor 1) (9/10) 2024-06-12 07:06:14,571 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 9.0 in stage 30.0 (TID 226) (vm-58f13156, executor 1, partition 9, PROCESS_LOCAL, 8502 bytes) taskResourceAssignments Map() 2024-06-12 07:06:14,571 INFO TaskSetManager [task-result-getter-1]: Finished task 8.0 in stage 29.0 (TID 215) in 6059 ms on vm-58f13156 (executor 1) (10/10) 2024-06-12 07:06:14,572 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 29.0, whose tasks have all completed, from pool 2024-06-12 07:06:14,572 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 29 (showString at NativeMethodAccessorImpl.java:0) finished in 15.188 s 2024-06-12 07:06:14,573 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:14,574 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 33, ShuffleMapStage 30, ShuffleMapStage 31) 2024-06-12 07:06:14,574 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:14,574 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:14,596 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for 
List(CustomerID#724L) does not exist 2024-06-12 07:06:14,596 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,596 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:14,596 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:14,596 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:14,596 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,597 INFO 
EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:14,597 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:14,598 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:14,598 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:15,669 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 31.0 (TID 227) (vm-58f13156, executor 1, partition 0, PROCESS_LOCAL, 8325 bytes) taskResourceAssignments Map() 2024-06-12 07:06:15,669 INFO TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 30.0 (TID 223) in 3109 ms on vm-58f13156 (executor 1) (3/10) 2024-06-12 07:06:15,678 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_30_piece0 in memory on vm-58f13156:40101 (size: 15.2 KiB, free: 2.2 GiB) 2024-06-12 07:06:15,688 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 33.0 (TID 228) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:15,688 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 30.0 (TID 224) in 3120 ms on vm-58f13156 (executor 1) (4/10) 2024-06-12 07:06:15,697 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_31_piece0 in memory on vm-58f13156:40101 (size: 16.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:15,710 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to 
send map output locations for shuffle 7 to 10.0.32.6:33208 2024-06-12 07:06:15,778 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_29_piece0 in memory on vm-58f13156:40101 (size: 55.4 KiB, free: 2.2 GiB) 2024-06-12 07:06:16,024 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 33.0 (TID 229) (vm-58f13156, executor 1, partition 1, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,024 INFO TaskSetManager [task-result-getter-0]: Finished task 8.0 in stage 30.0 (TID 225) in 1470 ms on vm-58f13156 (executor 1) (5/10) 2024-06-12 07:06:16,028 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 33.0 (TID 230) (vm-58f13156, executor 1, partition 2, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,029 INFO TaskSetManager [task-result-getter-1]: Finished task 0.0 in stage 31.0 (TID 227) in 360 ms on vm-58f13156 (executor 1) (1/1) 2024-06-12 07:06:16,030 INFO YarnClusterScheduler [task-result-getter-1]: Removed TaskSet 31.0, whose tasks have all completed, from pool 2024-06-12 07:06:16,030 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 31 (showString at NativeMethodAccessorImpl.java:0) finished in 16.420 s 2024-06-12 07:06:16,030 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:16,030 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 33, ShuffleMapStage 30) 2024-06-12 07:06:16,030 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:16,030 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:16,054 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,054 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,054 INFO 
EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:16,054 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:16,054 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:16,054 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,055 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default 
shuffle keys 2024-06-12 07:06:16,056 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,056 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,056 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:16,056 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:16,066 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(11), advisory target size: 67108864, actual target size 1048576, minimum partition size: 1048576 2024-06-12 07:06:16,096 INFO HashAggregateExec [broadcast-exchange-1]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 2024-06-12 07:06:16,114 INFO CodeGenerator [broadcast-exchange-1]: Code generated in 14.846718 ms 2024-06-12 07:06:16,130 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 33.0 (TID 231) (vm-58f13156, executor 1, partition 3, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,131 INFO TaskSetManager [task-result-getter-3]: Finished task 9.0 in stage 30.0 (TID 226) in 1560 ms on vm-58f13156 (executor 1) (6/10) 2024-06-12 07:06:16,140 INFO SparkContext [broadcast-exchange-1]: Starting job: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266 2024-06-12 07:06:16,141 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 22 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) with 1 output partitions 2024-06-12 07:06:16,141 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 35 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) 2024-06-12 07:06:16,141 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: 
List(ShuffleMapStage 34) 2024-06-12 07:06:16,142 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:06:16,142 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 35 (MapPartitionsRDD[74] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266), which has no missing parents 2024-06-12 07:06:16,143 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 33.0 (TID 232) (vm-58f13156, executor 1, partition 4, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,144 INFO TaskSetManager [task-result-getter-2]: Finished task 0.0 in stage 33.0 (TID 228) in 456 ms on vm-58f13156 (executor 1) (1/8) 2024-06-12 07:06:16,146 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_32 stored as values in memory (estimated size 37.6 KiB, free 3.0 GiB) 2024-06-12 07:06:16,148 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_32_piece0 stored as bytes in memory (estimated size 16.5 KiB, free 3.0 GiB) 2024-06-12 07:06:16,149 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_32_piece0 in memory on vm-58f13156:42761 (size: 16.5 KiB, free: 3.0 GiB) 2024-06-12 07:06:16,149 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 32 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:06:16,149 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 35 (MapPartitionsRDD[74] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:06:16,150 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 35.0 with 1 tasks resource profile 0 2024-06-12 07:06:16,368 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 33.0 (TID 233) (vm-58f13156, executor 1, partition 5, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,369 INFO TaskSetManager [task-result-getter-0]: Finished task 2.0 in 
stage 33.0 (TID 230) in 341 ms on vm-58f13156 (executor 1) (2/8) 2024-06-12 07:06:16,383 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 33.0 (TID 234) (vm-58f13156, executor 1, partition 6, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,384 INFO TaskSetManager [task-result-getter-1]: Finished task 1.0 in stage 33.0 (TID 229) in 360 ms on vm-58f13156 (executor 1) (3/8) 2024-06-12 07:06:16,449 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 7.0 in stage 33.0 (TID 235) (vm-58f13156, executor 1, partition 7, NODE_LOCAL, 7809 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,450 INFO TaskSetManager [task-result-getter-3]: Finished task 4.0 in stage 33.0 (TID 232) in 307 ms on vm-58f13156 (executor 1) (4/8) 2024-06-12 07:06:16,457 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 35.0 (TID 236) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 7820 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,457 INFO TaskSetManager [task-result-getter-2]: Finished task 3.0 in stage 33.0 (TID 231) in 327 ms on vm-58f13156 (executor 1) (5/8) 2024-06-12 07:06:16,464 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_32_piece0 in memory on vm-58f13156:40101 (size: 16.5 KiB, free: 2.2 GiB) 2024-06-12 07:06:16,468 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 11 to 10.0.32.6:33208 2024-06-12 07:06:16,490 INFO TaskSetManager [task-result-getter-0]: Finished task 0.0 in stage 35.0 (TID 236) in 33 ms on vm-58f13156 (executor 1) (1/1) 2024-06-12 07:06:16,490 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 35.0, whose tasks have all completed, from pool 2024-06-12 07:06:16,491 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 35 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) finished in 0.345 s 2024-06-12 07:06:16,491 INFO DAGScheduler 
[dag-scheduler-event-loop]: Job 22 is finished. Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:06:16,491 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 35: Stage finished 2024-06-12 07:06:16,491 INFO DAGScheduler [broadcast-exchange-1]: Job 22 finished: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266, took 0.350762 s 2024-06-12 07:06:16,496 INFO MemoryStore [broadcast-exchange-1]: Block broadcast_33_piece0 stored as bytes in memory (estimated size 9.5 KiB, free 3.0 GiB) 2024-06-12 07:06:16,496 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_33_piece0 in memory on vm-58f13156:42761 (size: 9.5 KiB, free: 3.0 GiB) 2024-06-12 07:06:16,497 INFO SparkContext [broadcast-exchange-1]: Created broadcast 33 from $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:16,515 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't 
allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,516 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,517 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:16,517 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:16,669 INFO TaskSetManager [task-result-getter-1]: Finished task 6.0 in stage 33.0 (TID 234) in 286 ms on vm-58f13156 (executor 1) (6/8) 2024-06-12 07:06:16,696 INFO TaskSetManager [task-result-getter-3]: Finished 
task 5.0 in stage 33.0 (TID 233) in 329 ms on vm-58f13156 (executor 1) (7/8) 2024-06-12 07:06:16,789 INFO TaskSetManager [task-result-getter-2]: Finished task 7.0 in stage 33.0 (TID 235) in 340 ms on vm-58f13156 (executor 1) (8/8) 2024-06-12 07:06:16,789 INFO YarnClusterScheduler [task-result-getter-2]: Removed TaskSet 33.0, whose tasks have all completed, from pool 2024-06-12 07:06:16,790 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 33 (showString at NativeMethodAccessorImpl.java:0) finished in 13.193 s 2024-06-12 07:06:16,790 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:16,790 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 30) 2024-06-12 07:06:16,790 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:16,790 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:16,809 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,809 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,809 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:16,809 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:16,809 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(cast(CustomerID#737 as bigint)) does not exist 2024-06-12 07:06:16,809 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(cast(CustomerID#737 as bigint)), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,810 INFO 
EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:16,810 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#760L) does not exist 2024-06-12 07:06:16,811 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#760L), returning default shuffle keys 2024-06-12 07:06:16,817 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(12, 8), advisory target size: 67108864, actual target size 1048576, minimum partition size: 1048576 
2024-06-12 07:06:16,873 INFO CodeGenerator [Thread-47]: Code generated in 19.796658 ms 2024-06-12 07:06:16,888 INFO CodeGenerator [Thread-47]: Code generated in 13.78441 ms 2024-06-12 07:06:16,894 INFO HashAggregateExec [Thread-47]: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate. 2024-06-12 07:06:16,912 INFO CodeGenerator [Thread-47]: Code generated in 15.188421 ms 2024-06-12 07:06:16,921 INFO DAGScheduler [dag-scheduler-event-loop]: Registering RDD 81 (showString at NativeMethodAccessorImpl.java:0) as input to shuffle 13 2024-06-12 07:06:16,921 INFO DAGScheduler [dag-scheduler-event-loop]: Got map stage job 23 (showString at NativeMethodAccessorImpl.java:0) with 7 output partitions 2024-06-12 07:06:16,921 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ShuffleMapStage 39 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:06:16,921 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 37, ShuffleMapStage 38) 2024-06-12 07:06:16,922 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:06:16,922 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ShuffleMapStage 39 (MapPartitionsRDD[81] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:06:16,927 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_34 stored as values in memory (estimated size 62.7 KiB, free 3.0 GiB) 2024-06-12 07:06:16,929 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_34_piece0 stored as bytes in memory (estimated size 24.6 KiB, free 3.0 GiB) 2024-06-12 07:06:16,929 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_34_piece0 in memory on vm-58f13156:42761 (size: 24.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:16,929 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 34 from broadcast at DAGScheduler.scala:1521 
2024-06-12 07:06:16,930 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 7 missing tasks from ShuffleMapStage 39 (MapPartitionsRDD[81] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6)) 2024-06-12 07:06:16,930 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 39.0 with 7 tasks resource profile 0 2024-06-12 07:06:16,931 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 39.0 (TID 237) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,932 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 39.0 (TID 238) (vm-58f13156, executor 1, partition 1, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,932 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 39.0 (TID 239) (vm-58f13156, executor 1, partition 2, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,932 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 39.0 (TID 240) (vm-58f13156, executor 1, partition 3, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:16,940 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_34_piece0 in memory on vm-58f13156:40101 (size: 24.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:16,945 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 12 to 10.0.32.6:33208 2024-06-12 07:06:16,973 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 8 to 10.0.32.6:33208 2024-06-12 07:06:17,272 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 39.0 (TID 241) (vm-58f13156, executor 1, partition 4, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,273 INFO TaskSetManager 
[task-result-getter-0]: Finished task 1.0 in stage 39.0 (TID 238) in 341 ms on vm-58f13156 (executor 1) (1/7) 2024-06-12 07:06:17,284 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 39.0 (TID 242) (vm-58f13156, executor 1, partition 5, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,285 INFO TaskSetManager [task-result-getter-1]: Finished task 0.0 in stage 39.0 (TID 237) in 354 ms on vm-58f13156 (executor 1) (2/7) 2024-06-12 07:06:17,339 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 6.0 in stage 39.0 (TID 243) (vm-58f13156, executor 1, partition 6, NODE_LOCAL, 8091 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,339 INFO TaskSetManager [task-result-getter-3]: Finished task 3.0 in stage 39.0 (TID 240) in 407 ms on vm-58f13156 (executor 1) (3/7) 2024-06-12 07:06:17,341 INFO TaskSetManager [task-result-getter-2]: Finished task 2.0 in stage 39.0 (TID 239) in 409 ms on vm-58f13156 (executor 1) (4/7) 2024-06-12 07:06:17,444 INFO TaskSetManager [task-result-getter-0]: Finished task 5.0 in stage 39.0 (TID 242) in 160 ms on vm-58f13156 (executor 1) (5/7) 2024-06-12 07:06:17,463 INFO TaskSetManager [task-result-getter-1]: Finished task 4.0 in stage 39.0 (TID 241) in 191 ms on vm-58f13156 (executor 1) (6/7) 2024-06-12 07:06:17,538 INFO TaskSetManager [task-result-getter-3]: Finished task 6.0 in stage 39.0 (TID 243) in 199 ms on vm-58f13156 (executor 1) (7/7) 2024-06-12 07:06:17,538 INFO YarnClusterScheduler [task-result-getter-3]: Removed TaskSet 39.0, whose tasks have all completed, from pool 2024-06-12 07:06:17,539 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 39 (showString at NativeMethodAccessorImpl.java:0) finished in 0.613 s 2024-06-12 07:06:17,539 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:17,539 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set(ShuffleMapStage 30) 2024-06-12 07:06:17,539 INFO 
DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:17,539 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:17,553 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:17,554 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:17,554 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:17,558 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(13), advisory target size: 67108864, actual target size 1048576, minimum partition size: 1048576 2024-06-12 07:06:17,588 INFO SparkContext [broadcast-exchange-2]: Starting job: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266 2024-06-12 07:06:17,590 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 24 ($anonfun$withThreadLocalCaptured$1 at 
FutureTask.java:266) with 6 output partitions 2024-06-12 07:06:17,590 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 44 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) 2024-06-12 07:06:17,590 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 43) 2024-06-12 07:06:17,590 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:06:17,591 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 44 (MapPartitionsRDD[83] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266), which has no missing parents 2024-06-12 07:06:17,592 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_35 stored as values in memory (estimated size 7.4 KiB, free 3.0 GiB) 2024-06-12 07:06:17,593 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_35_piece0 stored as bytes in memory (estimated size 4.0 KiB, free 3.0 GiB) 2024-06-12 07:06:17,593 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_35_piece0 in memory on vm-58f13156:42761 (size: 4.0 KiB, free: 3.0 GiB) 2024-06-12 07:06:17,594 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 35 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:06:17,594 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 6 missing tasks from ResultStage 44 (MapPartitionsRDD[83] at $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5)) 2024-06-12 07:06:17,594 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 44.0 with 6 tasks resource profile 0 2024-06-12 07:06:17,595 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 44.0 (TID 244) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,595 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 1.0 in stage 44.0 (TID 245) (vm-58f13156, executor 1, 
partition 1, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,596 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 2.0 in stage 44.0 (TID 246) (vm-58f13156, executor 1, partition 2, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,596 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 3.0 in stage 44.0 (TID 247) (vm-58f13156, executor 1, partition 3, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,601 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_35_piece0 in memory on vm-58f13156:40101 (size: 4.0 KiB, free: 2.2 GiB) 2024-06-12 07:06:17,604 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 13 to 10.0.32.6:33208 2024-06-12 07:06:17,672 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 4.0 in stage 44.0 (TID 248) (vm-58f13156, executor 1, partition 4, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,673 INFO TaskSetManager [task-result-getter-2]: Finished task 2.0 in stage 44.0 (TID 246) in 78 ms on vm-58f13156 (executor 1) (1/6) 2024-06-12 07:06:17,674 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 5.0 in stage 44.0 (TID 249) (vm-58f13156, executor 1, partition 5, NODE_LOCAL, 7836 bytes) taskResourceAssignments Map() 2024-06-12 07:06:17,674 INFO TaskSetManager [task-result-getter-0]: Finished task 1.0 in stage 44.0 (TID 245) in 79 ms on vm-58f13156 (executor 1) (2/6) 2024-06-12 07:06:17,676 INFO TaskSetManager [task-result-getter-1]: Finished task 3.0 in stage 44.0 (TID 247) in 80 ms on vm-58f13156 (executor 1) (3/6) 2024-06-12 07:06:17,707 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added taskresult_244 in memory on vm-58f13156:40101 (size: 1877.2 KiB, free: 2.2 GiB) 2024-06-12 07:06:17,716 INFO TaskSetManager [task-result-getter-3]: Finished task 0.0 in stage 44.0 (TID 244) in 121 ms 
on vm-58f13156 (executor 1) (4/6) 2024-06-12 07:06:17,718 INFO TaskSetManager [task-result-getter-2]: Finished task 4.0 in stage 44.0 (TID 248) in 46 ms on vm-58f13156 (executor 1) (5/6) 2024-06-12 07:06:17,720 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed taskresult_244 on vm-58f13156:40101 in memory (size: 1877.2 KiB, free: 2.2 GiB) 2024-06-12 07:06:17,735 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added taskresult_249 in memory on vm-58f13156:40101 (size: 1321.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:17,743 INFO TaskSetManager [task-result-getter-0]: Finished task 5.0 in stage 44.0 (TID 249) in 70 ms on vm-58f13156 (executor 1) (6/6) 2024-06-12 07:06:17,744 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 44.0, whose tasks have all completed, from pool 2024-06-12 07:06:17,744 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 44 ($anonfun$withThreadLocalCaptured$1 at FutureTask.java:266) finished in 0.153 s 2024-06-12 07:06:17,744 INFO DAGScheduler [dag-scheduler-event-loop]: Job 24 is finished. 
Cancelling potential speculative or zombie tasks for this job 2024-06-12 07:06:17,744 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 44: Stage finished 2024-06-12 07:06:17,745 INFO DAGScheduler [broadcast-exchange-2]: Job 24 finished: $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266, took 0.155971 s 2024-06-12 07:06:17,747 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed taskresult_249 on vm-58f13156:40101 in memory (size: 1321.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:18,265 INFO MemoryStore [broadcast-exchange-2]: Block broadcast_36_piece0 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:06:18,266 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece0 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:06:18,268 INFO MemoryStore [broadcast-exchange-2]: Block broadcast_36_piece1 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:06:18,268 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece1 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:06:18,270 INFO MemoryStore [broadcast-exchange-2]: Block broadcast_36_piece2 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:06:18,271 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece2 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:06:18,273 INFO MemoryStore [broadcast-exchange-2]: Block broadcast_36_piece3 stored as bytes in memory (estimated size 4.0 MiB, free 3.0 GiB) 2024-06-12 07:06:18,273 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece3 in memory on vm-58f13156:42761 (size: 4.0 MiB, free: 3.0 GiB) 2024-06-12 07:06:18,273 INFO MemoryStore [broadcast-exchange-2]: Block broadcast_36_piece4 stored as bytes in memory (estimated size 593.6 KiB, free 3.0 GiB) 2024-06-12 07:06:18,274 INFO 
BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece4 in memory on vm-58f13156:42761 (size: 593.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:18,274 INFO SparkContext [broadcast-exchange-2]: Created broadcast 36 from $anonfun$withThreadLocalCaptured$1 at FutureTask.java:266 2024-06-12 07:06:18,292 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:18,292 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:18,292 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:18,292 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:18,292 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:18,292 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:18,293 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:18,293 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:18,293 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:18,293 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:18,923 INFO DefaultsConfigSparkListener [spark-listener-group-shared]: Persisted __spark_conf_merge_records__.json 2024-06-12 07:06:18,929 INFO AsyncEventQueue [spark-listener-group-shared]: Process of event SparkListenerEnvironmentUpdate(Map(Spark Properties -> 
ArrayBuffer((spark.advise.nameToClass.DataSkew,com.microsoft.impulse.insights.DataSkewInsight), (spark.advise.nameToClass.DeltaSmallFileAdvise,org.apache.spark.advise.DeltaSmallFileAdvise), (spark.advise.nameToClass.DeltaSmallFileAutoOptimizeAdvise,org.apache.spark.advise.DeltaSmallFileAutoOptimizeAdvise), (spark.advise.nameToClass.DeltaZOrderAdvise,org.apache.spark.advise.DeltaZOrderAdvise), (spark.advise.nameToClass.DivisionExprAdvise,org.apache.spark.advise.DivisionExprAdvise), (spark.advise.nameToClass.DriverError,com.microsoft.impulse.insights.DriverErrorInsight), (spark.advise.nameToClass.ExecutorError,com.microsoft.impulse.insights.ExecutorErrorInsight), (spark.advise.nameToClass.FileBadRecordAdvise,org.apache.spark.advise.FileBadRecordAdvise), (spark.advise.nameToClass.HintNotRecognized,org.apache.spark.advise.HintNotRecognizedAdvise), (spark.advise.nameToClass.HintOverridden,org.apache.spark.advise.HintOverriddenAdvise), (spark.advise.nameToClass.HintRelationsNotFound,org.apache.spark.advise.HintRelationsNotFoundAdvise), (spark.advise.nameToClass.NonEqJoinAdvise,org.apache.spark.advise.NonEqJoinAdvise), (spark.advise.nameToClass.PercentilesMergeAdvise,org.apache.spark.advise.PercentilesMergeAdvise), (spark.advise.nameToClass.RandomSplitInconsistentAdvise,org.apache.spark.advise.RandomSplitInconsistentAdvise), (spark.advise.nameToClass.SparkStopAdvise,org.apache.spark.advise.SparkStopAdvise), (spark.advise.nameToClass.TaskError,com.microsoft.impulse.insights.TaskErrorInsight), (spark.advise.nameToClass.TimeSkew,com.microsoft.impulse.insights.TimeSkewInsight), (spark.advise.nameToClass.ViewAndTableNameCollision,org.apache.spark.advise.ViewAndTableNameCollisionAdvise), (spark.advisor.enabled,true), 
(spark.aml.obotoken,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZS9leHBlcmltZW50TmFtZS91bWljby1kcy1sb3lhbHR5LWZlZWQvcnVuSWQvcmVkX3NlZWRfbjl2NGtqc20xNiIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImV4cCI6MTcxOTkwMzcxNCwiYXVkIjoiYXp1cmVtbCJ9.CYO9tBokxw5fqbaOoLDK5A5kZQfG8iqhET8GGNxxJPAHizmKPm8skfsUYWa3ucdTKqK6L6np0rUzwrht49ctua2G0ynBXGKuytVpBXpYuAZY_XcJQkkUiNp25sOtfYVyCrOYr-BNDwjwrcXHIgcV-L5mCBqw-1sVQZy0YBxoCRFEjJEpf_3Jp58H8JZ00X8kcWSpxz1La3tFbYNEF4j1TVokaQlw_hjyJLb0mFx-2kPaPgmITC1ZEvzXuuxlJ93BjwPZyie-3dWKoADBpgWFFJ0T5DaEcyF_gnbbFu9MgXKG5Io9jY1LIVo7x-ElIOgX4a_TNt6UHmKG90zxV6WJEA), (spark.app.attempt.id,1), (spark.app.id,application_1718175835080_0001), (spark.app.name,Azure ML Experiment), (spark.app.startTime,1718175879632), (spark.app.submitTime,1718175853681), (spark.appLiveStatusPlugins,org.apache.spark.ui.EnhancementLiveStatusPlugin,org.apache.spark.diagnostic.synapse.SparkDiagnosticPlugin,org.apache.spark.deploy.history.rpc.app.RpcAppLivePlugin), (spark.arcadia.session.token,*********(redacted)), (spark.autoscale.executorResourceInfoTag.enabled,true), (spark.cluster.environment.name,Arcadia-Cluster-Service-PROD-WestEurope), (spark.cluster.environment.type,PROD), (spark.cluster.name,null), (spark.cluster.node.name,vm-58f13156), (spark.cluster.region,westeurope), (spark.cluster.type,aml), 
(spark.databricks.delta.vacuum.parallelDelete.enabled,true), (spark.decommission.enabled,true), (spark.delta.logStore.class,org.apache.spark.sql.delta.storage.AzureLogStore), (spark.dotnet.nuget.fallbackPackagesPath,/usr/local/lib/sparkdotnet/.nuget/packages), (spark.dotnet.packages,nuget:Microsoft.Spark,2.1.0-prerelease.22115.1;nuget:Microsoft.Spark.Extensions.DotNet.Interactive,2.1.0-prerelease.22115.1;nuget:Microsoft.Spark.Extensions.Delta,2.1.0-prerelease.22115.1;nuget:Microsoft.Spark.Extensions.Hyperspace,2.1.0-prerelease.22115.1;nuget:Microsoft.Spark.Extensions.Azure.Synapse.Analytics,0.15.0), (spark.dotnet.shell.command,/usr/share/dotnet-tools/dotnet-interactive,[synapse],stdio,--default-kernel,csharp), (spark.driver.cores,2), (spark.driver.extraClassPath,/usr/lib/library-manager/bin/libraries/scala/*:/usr/lib/dw-connector/synapse/*), (spark.driver.extraJavaOptions,-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Dlog4j2.configurationFile=file:/usr/hdp/current/spark3-client/conf/driver-log4j2.properties -Detwlogger.component=sparkdriver -DlogFilter.filename=SparkLogFilters.xml -DpatternGroup.filename=SparkPatternGroups.xml -Dlog4jspark.root.logger=INFO,console,RFA,ETW,Anonymizer -Dlog4jspark.log.dir=/var/log/sparkapp/${user.name} -Dlog4jspark.log.file=sparkdriver.log 
-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+TieredCompilation -XX:Tier4CompileThreshold=150000 -noverify), (spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native), (spark.driver.host,vm-58f13156), (spark.driver.maxResultSize,4096m), (spark.driver.memory,6g), (spark.driver.memoryOverhead,384), (spark.driver.port,42075), (spark.dynamicAllocation.disableIfMinMaxNotSpecified.enabled,true), (spark.dynamicAllocation.shuffleTracking.enabled,true), (spark.eventLog.buffer.kb,4k), (spark.eventLog.dir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/events/291/eventLogs/), (spark.eventLog.enabled,true), (spark.executor.cores,4), (spark.executor.extraClassPath,/usr/lib/library-manager/bin/libraries/scala/*:/usr/lib/dw-connector/synapse/*), (spark.executor.extraJavaOptions,-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Dlog4j2.configurationFile=file:/usr/hdp/current/spark3-client/conf/executor-log4j2.properties -Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl -XX:+UseG1GC), (spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native), (spark.executor.id,driver), (spark.executor.instances,2), 
(spark.executor.memory,4g), (spark.executor.memoryOverhead,384), (spark.executorEnv.AZUREML_ARM_PROJECT_NAME,umico-ds-loyalty-feed), (spark.executorEnv.AZUREML_ARM_RESOURCEGROUP,ds-resources), (spark.executorEnv.AZUREML_ARM_SUBSCRIPTION,14e4c1c9-5437-4eb8-8dad-45696707c729), (spark.executorEnv.AZUREML_ARM_WORKSPACE_NAME,ds-workspace), (spark.executorEnv.AZUREML_DATAPREP_TOKEN_PROVIDER,sparkobo), (spark.executorEnv.AZUREML_EXPERIMENT_ID,63a713bb-3fa2-48af-a358-bc0522d4b3e1), (spark.executorEnv.AZUREML_OBO_CANARY_TOKEN,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZS9leHBlcmltZW50TmFtZS91bWljby1kcy1sb3lhbHR5LWZlZWQvcnVuSWQvcmVkX3NlZWRfbjl2NGtqc20xNiIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImV4cCI6MTcxOTkwMzcxNCwiYXVkIjoiYXp1cmVtbCJ9.CYO9tBokxw5fqbaOoLDK5A5kZQfG8iqhET8GGNxxJPAHizmKPm8skfsUYWa3ucdTKqK6L6np0rUzwrht49ctua2G0ynBXGKuytVpBXpYuAZY_XcJQkkUiNp25sOtfYVyCrOYr-BNDwjwrcXHIgcV-L5mCBqw-1sVQZy0YBxoCRFEjJEpf_3Jp58H8JZ00X8kcWSpxz1La3tFbYNEF4j1TVokaQlw_hjyJLb0mFx-2kPaPgmITC1ZEvzXuuxlJ93BjwPZyie-3dWKoADBpgWFFJ0T5DaEcyF_gnbbFu9MgXKG5Io9jY1LIVo7x-ElIOgX4a_TNt6UHmKG90zxV6WJEA), (spark.executorEnv.AZUREML_OBO_SERVICE_ENDPOINT,https://westeurope.api.azureml.ms), (spark.executorEnv.AZUREML_OBO_USER_TOKEN_FOR_SPARK_RETRIEVAL_API,getuseraccesstokenforspark), 
(spark.executorEnv.AZUREML_RUN_ID,red_seed_n9v4kjsm16), (spark.executorEnv.AZUREML_RUN_TOKEN,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZSIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMjVkODI3ZjMtY2YxMC00ZTJhLWI2NWQtNDBjMzE2ODEyZGRkIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImFwcGlkIjoiQmFraHJ1eiBEemhhZmFyb3YiLCJleHAiOjE3MTk5OTczMTQsImF1ZCI6ImF6dXJlbWwifQ.qMiL7T_ZZwGiGLKd9yOQLAI9KvI-W4_xIDoPsl42rL6Gu1uJGlbD5XxOAcfoVzFnKAH7tooTKsVvWUQ4xV9EoHoEwlIQo3psQKnL2QRZ4mxB8i6vEOU8vEu0oVwMvPwoEP3fcw1cPCSrPOSYkexvIqFoQj5HUtbQibEmcFQUhRgDW1G0pT5UWO_uyuro_pR5enTxwNS3F6MPQKHTXgBGl_4nyM1K21yLEvp2P7jUfz7C3sG_rdcAeWA4IZWAYWZUfOO2Zi2d9Id7WnX8yOekNldT1aUFBUwajrkFZmR5qpzqGl0VkmMcGNz2cJCxydbUiIXqG3zU3MHh-oPnj5giKw), (spark.executorEnv.AZUREML_RUN_TOKEN_EXPIRY,1719997314), (spark.executorEnv.AZUREML_SERVICE_CERT_ENDPOINT,https://westeurope.cert.api.azureml.ms), (spark.executorEnv.AZUREML_SERVICE_ENDPOINT,https://westeurope.api.azureml.ms), (spark.executorEnv.AZUREML_WORKSPACE_ID,25d827f3-cf10-4e2a-b65d-40c316812ddd), (spark.executorEnv.JAVA_TOOL_OPTIONS,-Djdk.jar.maxSignatureFileSize=2147483639), (spark.executorEnv.OID,1891a550-4704-4a51-953d-a5e334b794f5), (spark.executorEnv.PATH,/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/usr/local/cuda-11.5/bin:/home/trusted-service-user/cluster-env/env/bin:/home/trusted-service-user/cluster-env/synapse_trident_r/bin), 
(spark.executorEnv.PYTHONPATH,/opt/spark/python/lib/pyspark.zip/opt/spark/python/lib/py4j-0.10.7-src.zip{{PWD}}/source.zip{{PWD}}/setup.zip), (spark.executorEnv.SPARKR_INLINE_SESSION_LEVEL_ENABLE,true), (spark.executorEnv.SPARK_HOME,/opt/spark), (spark.executorEnv.TID,caa95068-7cc1-4c41-925f-875c22a5c4c9), (spark.extraListeners,com.microsoft.hdinsight.spark.metrics.SparkMetricsListener,org.apache.spark.listeners.LogAnalyticsSparkListener,com.microsoft.impulse.analyze.eventLog.ImpulseListener,org.apache.spark.advise.input.MetricsServiceListener), (spark.hadoop.fs.AbstractFileSystem.azureml.impl,org.apache.hadoop.fs.azureml.Azureml), (spark.hadoop.fs.adl.oauth2.access.token.provider,com.microsoft.azure.synapse.tokenlibrary.AzureMLTokenBasedTokenProviderGen1), (spark.hadoop.fs.azure.account.oauth.provider.type,com.microsoft.azure.synapse.tokenlibrary.AzureMLTokenBasedTokenProviderGen2), (spark.hadoop.fs.azure.block.blob.with.compaction.dir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/events/291/eventLogs/), (spark.hadoop.fs.azure.client.correlationid ,ffa29c45-bec0-4198-9c33-2915e199aa93), (spark.hadoop.fs.azure.sas.azureml-blobstore-25d827f3-cf10-4e2a-b65d-40c316812ddd.dsmlstoragexfsyt.blob.core.windows.net,*********(redacted)), (spark.hadoop.fs.azureml.impl,org.apache.hadoop.fs.azureml.AzureMLFileSystem), (spark.hadoop.fs.azureml.impl.disable.cache,true), (spark.hadoop.javax.jdo.option.ConnectionDriverName,com.microsoft.sqlserver.jdbc.SQLServerDriver), (spark.hadoop.javax.jdo.option.ConnectionPassword,*********(redacted)), (spark.hadoop.javax.jdo.option.ConnectionURL,;), (spark.hadoop.javax.jdo.option.ConnectionUserName,), (spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version,2), (spark.hadoop.synapse.vfs.acceptedThreadNames,Executor task launch), (spark.hadoop.synapse.vfs.debug.log.level,3), (spark.hadoop.synapse.vfs.disabled.extensions,.csv), (spark.hadoop.synapse.vfs.enabled,true), 
(spark.hadoop.synapse.vfs.enabled.extensions,.parquet), (spark.history.fs.cleaner.enabled,false), (spark.history.fs.cleaner.interval,30d), (spark.history.store.path,/var/lib/spark3/shs_db), (spark.history.ui.port,18080), (spark.inputOutput.data.enabled,true), (spark.io.compression.lz4.blockSize,128kb), (spark.jars.ivy.lockStrategy,artifact-lock), (spark.jars.ivy.retrieve.cleanup,true), (spark.jars.ivy.retrieve.symlink,true), (spark.jobGroup.sourceMapping.enabled,true), (spark.jobGroup.usageDescription.enable,true), (spark.kryo.registrator,com.microsoft.spark.sqlanalytics.utils.MyKryoRegistrator), (spark.kryoserializer.buffer.max,128m), (spark.lighter.server.plugin,org.apache.spark.lighter.DefaultLighterServerPlugin), (spark.livy.pipeInteractiveConsoleBacktoSparkConsole.enabled,true), (spark.livy.pipeInteractiveConsoleRemovePruRunMarker.enabled,true), (spark.livy.session.type,batch), (spark.livy.synapse.cancelImprovement.enabled,true), (spark.livy.synapse.skipSplitCodeExecution.enabled,true), (spark.locality.wait,1), (spark.master,yarn), (spark.microsoft.delta.merge.lowShuffle.enabled,true), (spark.microsoft.delta.optimizeWrite.partitioned.enabled,true), (spark.mlflow.pysparkml.autolog.logModelAllowlistFile,https://mmlspark.blob.core.windows.net/publicwasb/log_model_allowlist.txt), (spark.nonjvm.error.buffer.size,208000), (spark.nonjvm.error.forwarding.enabled,True), (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,vm-58f13156,vm-b0b35166), (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://vm-58f13156:8088/proxy/application_1718175835080_0001,http://vm-b0b35166:8088/proxy/application_1718175835080_0001), (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.RM_HA_URLS,vm-58f13156:8088,vm-b0b35166:8088), (spark.pythonRunnerOutputStream.plugin,org.apache.spark.microsoft.tools.api.plugin.MSToolsPythonRunnerOutputStreamPlugin), 
(spark.r.shell.command,/home/trusted-service-user/R/source/RWrapper.sh), (spark.rapids.sql.concurrentGpuTasks,2), (spark.rapids.sql.explain,NOT_ON_GPU), (spark.rdd.compress,true), (spark.redaction.regex,*********(redacted)), (spark.reset.appName.enabled,true), (spark.scheduler.listenerbus.eventqueue.shared.timeout,10000), (spark.scheduler.listenerbus.eventqueue.sparkRpcHistoryServer.timeout,10000), (spark.scheduler.minRegisteredResourcesRatio,0.0), (spark.scheduler.mode,FIFO), (spark.serializer,org.apache.spark.serializer.KryoSerializer), (spark.serializer.objectStreamReset,100), (spark.shuffle.file.buffer,1m), (spark.shuffle.io.backLog,8192), (spark.shuffle.io.serverThreads,128), (spark.shuffle.service.client.class,org.apache.spark.network.shuffle.ShuffleMovementAwareExternalBlockStoreClient), (spark.shuffle.service.enabled,true), (spark.shuffle.unsafe.file.output.buffer,5m), (spark.sparkContextAfterInit.plugins,org.apache.spark.microsoft.tools.api.plugin.MSToolsSparkContextAfterInitPlugin), (spark.sparkr.r.command,/home/trusted-service-user/R/source/RWrapper.sh), (spark.sql.autoBroadcastJoinThreshold,26214400), (spark.sql.bnlj.codegen.enabled,true), (spark.sql.cardinalityEstimation.enabled,true), (spark.sql.catalog.pbi,com.microsoft.azure.synapse.ml.powerbi.PowerBICatalog), (spark.sql.catalog.spark_catalog,org.apache.spark.sql.delta.catalog.DeltaCatalog), (spark.sql.catalogImplementation,hive), (spark.sql.cbo.enabled,true), (spark.sql.cbo.joinReorder.enabled,true), (spark.sql.convertInnerJoinToLeftSemiJoin,true), (spark.sql.crossJoin.enabled,true), (spark.sql.decimalDivision.optimizationEnabled,true), (spark.sql.dpp.size.estimate,true), (spark.sql.exchange.reuse.correction.enabled,true), (spark.sql.execution.arrow.pyspark.enabled,true), (spark.sql.execution.arrow.pyspark.fallback.enabled,true), (spark.sql.execution.collapseAggregateNodes,true), (spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled,true), 
(spark.sql.extensions,com.microsoft.vegas.common.VegasExtensionBuilder,com.microsoft.peregrine.spark.extensions.SparkExtensionsSynapse,io.delta.sql.DeltaSparkSessionExtension,com.microsoft.azure.synapse.ml.predict.SynapsePredictExtensions), (spark.sql.files.maxPartitionBytes,134217728), (spark.sql.hint.error.handler,org.apache.spark.sql.advise.HintErrorAdvisorHandler), (spark.sql.hive.convertMetastoreOrc,true), (spark.sql.hive.metastore.jars,/opt/hive-metastore/lib/*), (spark.sql.hive.metastore.version,2.3.2), (spark.sql.joinConditionReorder.enabled,true), (spark.sql.legacy.replaceDatabricksSparkAvro.enabled,false), (spark.sql.local.window.optimization.enabled,true), (spark.sql.normalize.aggregate.enabled,false), (spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly,false), (spark.sql.optimizer.runtime.bloomFilter.enabled,true), (spark.sql.orc.filterPushdown,true), (spark.sql.orc.impl,native), (spark.sql.parquet.footerCache.size,1000), (spark.sql.preaggregation.enabled,true), (spark.sql.preaggregation.partition.key.based.stats.enabled,true), (spark.sql.preaggregation.pushdown.below.union.enabled,true), (spark.sql.pruneFileSourcePartitions.enableStats,true), (spark.sql.pushdown.project.below.expand.enabled,true), (spark.sql.sizeBasedJoinReorder.enabled,true), (spark.sql.smart.shuffle.enabled,true), (spark.sql.sources.parallelPartitionDiscovery.parallelism,200), (spark.sql.spark.cluster.type,aml), (spark.sql.statistics.fallBackToHdfs,true), (spark.sql.use.codegen.for.window.functions,true), (spark.sql.use.rollup.aggregate,true), (spark.sql.warehouse.dir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/synapse/workspaces/25d827f3-cf10-4e2a-b65d-40c316812ddd/warehouse), (spark.sql.window.sort.optimization.enabled,true), (spark.stop.improvement.enabled,true), (spark.storage.decommission.enabled,true), (spark.storage.decommission.notifyExternalShuffleService,true), (spark.storage.decommission.rddBlocks.enabled,false), 
(spark.storage.decommission.shuffleBlocks.enabled,false), (spark.submit.deployMode,cluster), (spark.submit.pyFiles,wasbs://azureml-blobstore-25d827f3-cf10-4e2a-b65d-40c316812ddd@dsmlstoragexfsyt.blob.core.windows.net/azureml/red_seed_n9v4kjsm16/source.zip,wasbs://azureml-blobstore-25d827f3-cf10-4e2a-b65d-40c316812ddd@dsmlstoragexfsyt.blob.core.windows.net/azureml/red_seed_n9v4kjsm16/setup.zip), (spark.synapse.clusteridentifier,13895c74-bfb6-434f-97d3-556cec4e06ef), (spark.synapse.customercorrelationid,ffa29c45-bec0-4198-9c33-2915e199aa93), (spark.synapse.dep.enabled,false), (spark.synapse.diagnostic.builtinEmitters,ShoeboxEmitter), (spark.synapse.diagnostic.emitter.ShoeboxEmitter.proxyServiceIp,), (spark.synapse.diagnostic.emitter.ShoeboxEmitter.type,Shoebox), (spark.synapse.gatewayHost,dev.azuresynapse.net), (spark.synapse.history.rpc.batch.size,2000), (spark.synapse.history.rpc.message.maxSize,10485760), (spark.synapse.history.rpc.port,18082), (spark.synapse.history.rpc.sparkContext.enabled,true), (spark.synapse.history.rpc.update.delayMs,2000), (spark.synapse.history.rpc.update.intervalMs,1000), (spark.synapse.history.rpc.update.retry.maxNumber,3), (spark.synapse.history.rpc.update.retry.waitMs,5000), (spark.synapse.history.rpc.update.timeoutMs,5000), (spark.synapse.history.rpc.waitAppStart.enabled,true), (spark.synapse.jobidentifier,25d827f3-cf10-4e2a-b65d-40c316812ddd.2954a676-18e8-4c10-af9c-144cc60a03c5.291), (spark.synapse.ml.predict.enabled,true), (spark.synapse.pool.name,2954a676-18e8-4c10-af9c-144cc60a03c5), (spark.synapse.rpc.listener.historyServer.address,${hadoopconf-yarn.resourcemanager.hostname.rm1}), (spark.synapse.rpc.listener.nodeInfo.enabled,true), (spark.synapse.rpc.listener.nodeInfo.path,/etc/bbc/nodes.json), (spark.synapse.studioHost,web.azuresynapse.net), (spark.synapse.vegas.EnableProgressiveDownload,true), (spark.synapse.vegas.cacheSize,0), (spark.synapse.vegas.consistent.hash,true), (spark.synapse.vegas.hash.placement,true), 
(spark.synapse.vegas.useCache,true), (spark.synapse.vhd.id,e64e70d5-39c9-4075-81b0-fffe6d7baf60), (spark.synapse.vhd.name,fc066110ee6fe8.vhd), (spark.synapse.workspace.name,25d827f3-cf10-4e2a-b65d-40c316812ddd), (spark.synapse.workspace.tenantId,caa95068-7cc1-4c41-925f-875c22a5c4c9), (spark.tokenServiceEndpoint,tokenservice2.westeurope.azuresynapse.net:443), (spark.trackingUrl.enabled,true), (spark.ui.advise.hub.impl.class,org.apache.spark.advise.DefaultAdviseHub), (spark.ui.enhancement.enabled,true), (spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter), (spark.ui.port,0), (spark.ui.prometheus.enabled,true), (spark.unsafe.sorter.spill.reader.buffer.size,1m), (spark.yarn.am.waitTime,100s), (spark.yarn.app.container.log.dir,/var/log/yarn-nm/userlogs/application_1718175835080_0001/container_1718175835080_0001_01_000001), (spark.yarn.app.id,application_1718175835080_0001), (spark.yarn.appMasterEnv.AZUREML_ARM_PROJECT_NAME,umico-ds-loyalty-feed), (spark.yarn.appMasterEnv.AZUREML_ARM_RESOURCEGROUP,ds-resources), (spark.yarn.appMasterEnv.AZUREML_ARM_SUBSCRIPTION,14e4c1c9-5437-4eb8-8dad-45696707c729), (spark.yarn.appMasterEnv.AZUREML_ARM_WORKSPACE_NAME,ds-workspace), (spark.yarn.appMasterEnv.AZUREML_ARTIFACT_SAS_TOKEN,sv=2019-07-07&sr=c&sig=r%2FqGQLMF3SKbXLB2CeuQwVO94zsqLGxlMVb2e7RxvC8%3D&st=2024-06-12T06%3A52%3A29Z&se=2024-06-26T07%3A02%3A29Z&sp=racwdl), (spark.yarn.appMasterEnv.AZUREML_COMMUNICATOR,None), (spark.yarn.appMasterEnv.AZUREML_COMPUTE_RECORD_ARTIFACT_ORIGIN,ComputeRecord), (spark.yarn.appMasterEnv.AZUREML_COMPUTE_RECORD_ARTIFACT_PATH,compute_record.txt), (spark.yarn.appMasterEnv.AZUREML_CONTEXT_MANAGER_PROJECTPYTHONPATH,bnVsbA==), (spark.yarn.appMasterEnv.AZUREML_CONTEXT_MANAGER_TRACKUSERERROR,eyJTa2lwSGlzdG9yeUltcG9ydENoZWNrIjoiVHJ1ZSJ9), (spark.yarn.appMasterEnv.AZUREML_CONTROLLOG_PATH,azureml-logs/control_log.txt), (spark.yarn.appMasterEnv.AZUREML_CURRENT_CLOUD,AzureCloud), 
(spark.yarn.appMasterEnv.AZUREML_CURRENT_CLOUD_METADATA,{"Portal":"https://portal.azure.com","Authentication":{"AzureDataLakeStoreFileSystem":null,"SqlServerHostname":null,"AzureDataLakeAnalyticsCatalogAndJob":null,"KeyVaultDns":null,"Storage":null,"AzureFrontDoorEndpointSuffix":null},"Media":"https://rest.media.azure.net","GraphAudience":"https://graph.windows.net/","Graph":"https://graph.windows.net/","Name":"AzureCloud","Suffixes":{"LoginEndpoint":null,"Audiences":null,"Tenant":null,"IdentityProvider":null},"Batch":"https://batch.core.windows.net/","ResourceManager":"https://management.azure.com/","VmImageAliasDoc":"https://raw.githubusercontent.com/Azure/azure-rest-api-specs/master/arm-compute/quickstart-templates/aliases.json","ActiveDirectoryDataLake":"https://datalake.azure.net/","SqlManagement":"https://management.core.windows.net:8443/","Gallery":"https://gallery.azure.com/"}), (spark.yarn.appMasterEnv.AZUREML_DATAPREP_TOKEN_PROVIDER,sparkobo), (spark.yarn.appMasterEnv.AZUREML_DATASET_FILE_OUTPUTS,AZURE_ML_OUTPUT_wrangled_data), (spark.yarn.appMasterEnv.AZUREML_DATA_CONTAINER_ID,dcid.red_seed_n9v4kjsm16), (spark.yarn.appMasterEnv.AZUREML_DISCOVERY_SERVICE_ENDPOINT,https://westeurope.api.azureml.ms/discovery), (spark.yarn.appMasterEnv.AZUREML_DRIVERLOG_PATH,azureml-logs/driver_log.txt), (spark.yarn.appMasterEnv.AZUREML_EXPERIMENT_ID,63a713bb-3fa2-48af-a358-bc0522d4b3e1), (spark.yarn.appMasterEnv.AZUREML_EXPERIMENT_SCOPE,/subscriptions/14e4c1c9-5437-4eb8-8dad-45696707c729/resourceGroups/ds-resources/providers/Microsoft.MachineLearningServices/workspaces/ds-workspace/experiments/umico-ds-loyalty-feed), (spark.yarn.appMasterEnv.AZUREML_FRAMEWORK,Python), (spark.yarn.appMasterEnv.AZUREML_INSTRUMENTATION_KEY,fb7e27a4-f865-4147-83ee-ffbf79d1a9f5), (spark.yarn.appMasterEnv.AZUREML_JOBPREPLOG_PATH,azureml-logs/job_prep_log.txt), (spark.yarn.appMasterEnv.AZUREML_JOBRELEASELOG_PATH,azureml-logs/job_release_log.txt), 
(spark.yarn.appMasterEnv.AZUREML_LINK_DATASET_OUTPUTS,), (spark.yarn.appMasterEnv.AZUREML_LOGDIRECTORY_PATH,azureml-logs/), (spark.yarn.appMasterEnv.AZUREML_OBO_ACCESS_TOKEN,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZS9leHBlcmltZW50TmFtZS91bWljby1kcy1sb3lhbHR5LWZlZWQvcnVuSWQvcmVkX3NlZWRfbjl2NGtqc20xNiIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImV4cCI6MTcxOTkwMzcxNCwiYXVkIjoiYXp1cmVtbCJ9.CYO9tBokxw5fqbaOoLDK5A5kZQfG8iqhET8GGNxxJPAHizmKPm8skfsUYWa3ucdTKqK6L6np0rUzwrht49ctua2G0ynBXGKuytVpBXpYuAZY_XcJQkkUiNp25sOtfYVyCrOYr-BNDwjwrcXHIgcV-L5mCBqw-1sVQZy0YBxoCRFEjJEpf_3Jp58H8JZ00X8kcWSpxz1La3tFbYNEF4j1TVokaQlw_hjyJLb0mFx-2kPaPgmITC1ZEvzXuuxlJ93BjwPZyie-3dWKoADBpgWFFJ0T5DaEcyF_gnbbFu9MgXKG5Io9jY1LIVo7x-ElIOgX4a_TNt6UHmKG90zxV6WJEA), 
(spark.yarn.appMasterEnv.AZUREML_OBO_CANARY_TOKEN,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZS9leHBlcmltZW50TmFtZS91bWljby1kcy1sb3lhbHR5LWZlZWQvcnVuSWQvcmVkX3NlZWRfbjl2NGtqc20xNiIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImV4cCI6MTcxOTkwMzcxNCwiYXVkIjoiYXp1cmVtbCJ9.CYO9tBokxw5fqbaOoLDK5A5kZQfG8iqhET8GGNxxJPAHizmKPm8skfsUYWa3ucdTKqK6L6np0rUzwrht49ctua2G0ynBXGKuytVpBXpYuAZY_XcJQkkUiNp25sOtfYVyCrOYr-BNDwjwrcXHIgcV-L5mCBqw-1sVQZy0YBxoCRFEjJEpf_3Jp58H8JZ00X8kcWSpxz1La3tFbYNEF4j1TVokaQlw_hjyJLb0mFx-2kPaPgmITC1ZEvzXuuxlJ93BjwPZyie-3dWKoADBpgWFFJ0T5DaEcyF_gnbbFu9MgXKG5Io9jY1LIVo7x-ElIOgX4a_TNt6UHmKG90zxV6WJEA), (spark.yarn.appMasterEnv.AZUREML_OBO_ENABLED,True), (spark.yarn.appMasterEnv.AZUREML_OBO_SERVICE_ENDPOINT,https://westeurope.api.azureml.ms), (spark.yarn.appMasterEnv.AZUREML_OBO_USER_TOKEN_FOR_SPARK_RETRIEVAL_API,getuseraccesstokenforspark), (spark.yarn.appMasterEnv.AZUREML_PIDFILE_PATH,azureml-setup/pid.txt), (spark.yarn.appMasterEnv.AZUREML_ROOT_RUN_ID,red_seed_n9v4kjsm16), (spark.yarn.appMasterEnv.AZUREML_RUN_CONFIGURATION,azureml-setup/mutated_run_configuration.json), (spark.yarn.appMasterEnv.AZUREML_RUN_HISTORY_SERVICE_ENDPOINT,https://westeurope.api.azureml.ms), (spark.yarn.appMasterEnv.AZUREML_RUN_ID,red_seed_n9v4kjsm16), 
(spark.yarn.appMasterEnv.AZUREML_RUN_TOKEN,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZSIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMjVkODI3ZjMtY2YxMC00ZTJhLWI2NWQtNDBjMzE2ODEyZGRkIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImFwcGlkIjoiQmFraHJ1eiBEemhhZmFyb3YiLCJleHAiOjE3MTk5OTczMTQsImF1ZCI6ImF6dXJlbWwifQ.qMiL7T_ZZwGiGLKd9yOQLAI9KvI-W4_xIDoPsl42rL6Gu1uJGlbD5XxOAcfoVzFnKAH7tooTKsVvWUQ4xV9EoHoEwlIQo3psQKnL2QRZ4mxB8i6vEOU8vEu0oVwMvPwoEP3fcw1cPCSrPOSYkexvIqFoQj5HUtbQibEmcFQUhRgDW1G0pT5UWO_uyuro_pR5enTxwNS3F6MPQKHTXgBGl_4nyM1K21yLEvp2P7jUfz7C3sG_rdcAeWA4IZWAYWZUfOO2Zi2d9Id7WnX8yOekNldT1aUFBUwajrkFZmR5qpzqGl0VkmMcGNz2cJCxydbUiIXqG3zU3MHh-oPnj5giKw), (spark.yarn.appMasterEnv.AZUREML_RUN_TOKEN_EXPIRY,1719997314), (spark.yarn.appMasterEnv.AZUREML_RUN_TOKEN_PASS,fe411a1a-722e-403b-820f-6ac5ec1822f4), (spark.yarn.appMasterEnv.AZUREML_RUN_TOKEN_RAND,5c494a1b-7f28-4af2-9ec0-a9369dd98e4b), (spark.yarn.appMasterEnv.AZUREML_SERVICE_CERT_ENDPOINT,https://westeurope.cert.api.azureml.ms), (spark.yarn.appMasterEnv.AZUREML_SERVICE_ENDPOINT,https://westeurope.api.azureml.ms), (spark.yarn.appMasterEnv.AZUREML_USER_OBJECT_ID,1891a550-4704-4a51-953d-a5e334b794f5), (spark.yarn.appMasterEnv.AZUREML_USER_SCRIPTS_VALIDATION_AS_FATAL,true), (spark.yarn.appMasterEnv.AZUREML_USER_TENANT_ID,caa95068-7cc1-4c41-925f-875c22a5c4c9), (spark.yarn.appMasterEnv.AZUREML_WORKSPACE_ID,25d827f3-cf10-4e2a-b65d-40c316812ddd), 
(spark.yarn.appMasterEnv.AZUREML_WORKSPACE_SCOPE,/subscriptions/14e4c1c9-5437-4eb8-8dad-45696707c729/resourceGroups/ds-resources/providers/Microsoft.MachineLearningServices/workspaces/ds-workspace), (spark.yarn.appMasterEnv.AZURE_ML_OUTPUT_wrangled_data,azureml://subscriptions/14e4c1c9-5437-4eb8-8dad-45696707c729/resourcegroups/ds-resources/workspaces/ds-workspace/datastores/loy_ds_recommendation_lenta/paths/), (spark.yarn.appMasterEnv.AZURE_SERVICE,Microsoft.ProjectArcadia), (spark.yarn.appMasterEnv.DOTNET_WORKER_2_1_0_DIR,/usr/local/bin/sparkdotnet/Microsoft.Spark.Worker/2.1.0-prerelease.22115.1), (spark.yarn.appMasterEnv.FAIRLEARN_LOGS,azureml-logs/telemetry_logs/fairlearn_log.txt), (spark.yarn.appMasterEnv.HBI_WORKSPACE_JOB,false), (spark.yarn.appMasterEnv.INTERPRET_C_LOGS,azureml-logs/telemetry_logs/interpret_community_log.txt), (spark.yarn.appMasterEnv.INTERPRET_TEXT_LOGS,azureml-logs/telemetry_logs/interpret_text_log.txt), (spark.yarn.appMasterEnv.JAVA_TOOL_OPTIONS,-Djdk.jar.maxSignatureFileSize=2147483639), (spark.yarn.appMasterEnv.MLFLOW_DISABLE_ENV_MANAGER_CONDA_WARNING,True), (spark.yarn.appMasterEnv.MLFLOW_EXPERIMENT_ID,63a713bb-3fa2-48af-a358-bc0522d4b3e1), (spark.yarn.appMasterEnv.MLFLOW_EXPERIMENT_NAME,umico-ds-loyalty-feed), (spark.yarn.appMasterEnv.MLFLOW_RUN_ID,red_seed_n9v4kjsm16), 
(spark.yarn.appMasterEnv.MLFLOW_TRACKING_TOKEN,eyJhbGciOiJSUzI1NiIsImtpZCI6IjI0Nzc2OEE4Rjc2OUVGRUFFMjk1QzU5QTExNkU5NjA5MDNBOTBGMkYiLCJ0eXAiOiJKV1QifQ.eyJyb2xlIjoiQ29udHJpYnV0b3IiLCJzY29wZSI6Ii9zdWJzY3JpcHRpb25zLzE0ZTRjMWM5LTU0MzctNGViOC04ZGFkLTQ1Njk2NzA3YzcyOS9yZXNvdXJjZUdyb3Vwcy9kcy1yZXNvdXJjZXMvcHJvdmlkZXJzL01pY3Jvc29mdC5NYWNoaW5lTGVhcm5pbmdTZXJ2aWNlcy93b3Jrc3BhY2VzL2RzLXdvcmtzcGFjZSIsImFjY291bnRpZCI6IjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMCIsIndvcmtzcGFjZUlkIjoiMjVkODI3ZjMtY2YxMC00ZTJhLWI2NWQtNDBjMzE2ODEyZGRkIiwicHJvamVjdGlkIjoiMDAwMDAwMDAtMDAwMC0wMDAwLTAwMDAtMDAwMDAwMDAwMDAwIiwiZGlzY292ZXJ5IjoidXJpOi8vZGlzY292ZXJ5dXJpLyIsInRpZCI6ImNhYTk1MDY4LTdjYzEtNGM0MS05MjVmLTg3NWMyMmE1YzRjOSIsIm9pZCI6IjE4OTFhNTUwLTQ3MDQtNGE1MS05NTNkLWE1ZTMzNGI3OTRmNSIsInB1aWQiOiIxMDAzMjAwMzIxMTZBM0VEIiwiaXNzIjoiYXp1cmVtbCIsImFwcGlkIjoiQmFraHJ1eiBEemhhZmFyb3YiLCJleHAiOjE3MTk5OTczMTQsImF1ZCI6ImF6dXJlbWwifQ.qMiL7T_ZZwGiGLKd9yOQLAI9KvI-W4_xIDoPsl42rL6Gu1uJGlbD5XxOAcfoVzFnKAH7tooTKsVvWUQ4xV9EoHoEwlIQo3psQKnL2QRZ4mxB8i6vEOU8vEu0oVwMvPwoEP3fcw1cPCSrPOSYkexvIqFoQj5HUtbQibEmcFQUhRgDW1G0pT5UWO_uyuro_pR5enTxwNS3F6MPQKHTXgBGl_4nyM1K21yLEvp2P7jUfz7C3sG_rdcAeWA4IZWAYWZUfOO2Zi2d9Id7WnX8yOekNldT1aUFBUwajrkFZmR5qpzqGl0VkmMcGNz2cJCxydbUiIXqG3zU3MHh-oPnj5giKw), (spark.yarn.appMasterEnv.MLFLOW_TRACKING_URI,azureml://westeurope.api.azureml.ms/mlflow/v1.0/subscriptions/14e4c1c9-5437-4eb8-8dad-45696707c729/resourceGroups/ds-resources/providers/Microsoft.MachineLearningServices/workspaces/ds-workspace), (spark.yarn.appMasterEnv.MMLSPARK_PLATFORM_INFO,synapse), (spark.yarn.appMasterEnv.OID,1891a550-4704-4a51-953d-a5e334b794f5), (spark.yarn.appMasterEnv.PATH,/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/usr/local/cuda-11.5/bin:/home/trusted-service-user/cluster-env/env/bin:/home/trusted-service-user/cluster-env/synapse_trident_r/bin), (spark.yarn.appMasterEnv.PYSPARK_PYTHON,/home/trusted-service-user/cluster-env/env/bin/python), (spark.yarn.appMasterEnv.PYTHONUNBUFFERED,True), 
(spark.yarn.appMasterEnv.SPARKR_INLINE_SESSION_LEVEL_ENABLE,true), (spark.yarn.appMasterEnv.TELEMETRY_LOGS,azureml-logs/telemetry_logs/), (spark.yarn.appMasterEnv.TID,caa95068-7cc1-4c41-925f-875c22a5c4c9), (spark.yarn.containerLauncherMaxThreads,25), (spark.yarn.dist.archives,), (spark.yarn.dist.files,), (spark.yarn.dist.jars,), (spark.yarn.dist.pyFiles,wasbs://azureml-blobstore-25d827f3-cf10-4e2a-b65d-40c316812ddd@dsmlstoragexfsyt.blob.core.windows.net/azureml/red_seed_n9v4kjsm16/source.zip,wasbs://azureml-blobstore-25d827f3-cf10-4e2a-b65d-40c316812ddd@dsmlstoragexfsyt.blob.core.windows.net/azureml/red_seed_n9v4kjsm16/setup.zip), (spark.yarn.executor.decommission.enabled,true), (spark.yarn.isPython,true), (spark.yarn.jars,local:///opt/spark/jars/*), (spark.yarn.maxAppAttempts,1), (spark.yarn.populateHadoopClasspath.overWrite,true), (spark.yarn.preserve.staging.files,false), (spark.yarn.queue,default), (spark.yarn.scheduler.heartbeat.interval-ms,1000), (spark.yarn.stagingDir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/user/trusted-service-user/), (spark.yarn.submit.waitAppCompletion,false), (spark.yarn.tags,livy-batch-0-OXMTLgZk)), Classpath Entries -> Vector((/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001,System Classpath), (/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/__spark_conf__,System Classpath), (/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/__spark_conf__/__hadoop_conf__,System Classpath), (/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/__spark_libs__/*,System Classpath), (/opt/spark/jars/HikariCP-2.5.1.jar,System Classpath), 
(/opt/spark/jars/JLargeArrays-1.5.jar,System Classpath), (/opt/spark/jars/JTransforms-3.1.jar,System Classpath), (/opt/spark/jars/RoaringBitmap-0.9.25.jar,System Classpath), (/opt/spark/jars/ST4-4.0.4.jar,System Classpath), (/opt/spark/jars/SparkCustomEvents-3.3.0-1.0.3.jar,System Classpath), (/opt/spark/jars/TokenLibrary-assembly-3.6.4.jar,System Classpath), (/opt/spark/jars/VegasConnector-3.3.09.jar,System Classpath), (/opt/spark/jars/activation-1.1.1.jar,System Classpath), (/opt/spark/jars/aircompressor-0.21.jar,System Classpath), (/opt/spark/jars/algebra_2.12-2.0.1.jar,System Classpath), (/opt/spark/jars/aliyun-java-sdk-core-4.5.10.jar,System Classpath), (/opt/spark/jars/aliyun-java-sdk-kms-2.11.0.jar,System Classpath), (/opt/spark/jars/aliyun-java-sdk-ram-3.1.0.jar,System Classpath), (/opt/spark/jars/aliyun-sdk-oss-3.13.0.jar,System Classpath), (/opt/spark/jars/annotations-17.0.0.jar,System Classpath), (/opt/spark/jars/antlr-runtime-3.5.2.jar,System Classpath), (/opt/spark/jars/antlr4-runtime-4.8.jar,System Classpath), (/opt/spark/jars/aopalliance-repackaged-2.6.1.jar,System Classpath), (/opt/spark/jars/apiguardian-api-1.1.0.jar,System Classpath), (/opt/spark/jars/arpack-2.2.1.jar,System Classpath), (/opt/spark/jars/arpack_combined_all-0.1.jar,System Classpath), (/opt/spark/jars/arrow-format-7.0.0.jar,System Classpath), (/opt/spark/jars/arrow-memory-core-7.0.0.jar,System Classpath), (/opt/spark/jars/arrow-memory-netty-7.0.0.jar,System Classpath), (/opt/spark/jars/arrow-vector-7.0.0.jar,System Classpath), (/opt/spark/jars/audience-annotations-0.5.0.jar,System Classpath), (/opt/spark/jars/autotune-client_2.12-1.11.1-3.3-125200601.jar,System Classpath), (/opt/spark/jars/autotune-common_2.12-1.11.1-3.3-125200601.jar,System Classpath), (/opt/spark/jars/avro-1.11.0.jar,System Classpath), (/opt/spark/jars/avro-ipc-1.11.0.jar,System Classpath), (/opt/spark/jars/avro-mapred-1.11.0.jar,System Classpath), (/opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar,System 
Classpath), (/opt/spark/jars/azure-data-lake-store-sdk-2.3.9.jar,System Classpath), (/opt/spark/jars/azure-eventhubs-3.3.0.jar,System Classpath), (/opt/spark/jars/azure-eventhubs-spark_2.12-2.3.22.jar,System Classpath), (/opt/spark/jars/azure-keyvault-core-1.0.0.jar,System Classpath), (/opt/spark/jars/azure-storage-7.0.1.jar,System Classpath), (/opt/spark/jars/azure-synapse-ml-pandas_2.12-0.1.1.jar,System Classpath), (/opt/spark/jars/azure-synapse-ml-predict_2.12-1.0.jar,System Classpath), (/opt/spark/jars/blas-2.2.1.jar,System Classpath), (/opt/spark/jars/bonecp-0.8.0.RELEASE.jar,System Classpath), (/opt/spark/jars/breeze-macros_2.12-1.2.jar,System Classpath), (/opt/spark/jars/breeze_2.12-1.2.jar,System Classpath), (/opt/spark/jars/cats-kernel_2.12-2.1.1.jar,System Classpath), (/opt/spark/jars/chill-java-0.10.0.jar,System Classpath), (/opt/spark/jars/chill_2.12-0.10.0.jar,System Classpath), (/opt/spark/jars/client-sdk-1.24.1.jar,System Classpath), (/opt/spark/jars/commons-cli-1.5.0.jar,System Classpath), (/opt/spark/jars/commons-codec-1.15.jar,System Classpath), (/opt/spark/jars/commons-collections-3.2.2.jar,System Classpath), (/opt/spark/jars/commons-collections4-4.4.jar,System Classpath), (/opt/spark/jars/commons-compiler-3.0.16.jar,System Classpath), (/opt/spark/jars/commons-compress-1.21.jar,System Classpath), (/opt/spark/jars/commons-crypto-1.1.0.jar,System Classpath), (/opt/spark/jars/commons-dbcp-1.4.jar,System Classpath), (/opt/spark/jars/commons-io-2.11.0.jar,System Classpath), (/opt/spark/jars/commons-lang-2.6.jar,System Classpath), (/opt/spark/jars/commons-lang3-3.12.0.jar,System Classpath), (/opt/spark/jars/commons-logging-1.1.3.jar,System Classpath), (/opt/spark/jars/commons-math3-3.6.1.jar,System Classpath), (/opt/spark/jars/commons-pool-1.5.4.jar,System Classpath), (/opt/spark/jars/commons-pool2-2.11.1.jar,System Classpath), (/opt/spark/jars/commons-text-1.10.0.jar,System Classpath), (/opt/spark/jars/compress-lzf-1.1.jar,System Classpath), 
(/opt/spark/jars/config-1.3.4.jar,System Classpath), (/opt/spark/jars/core-1.1.2.jar,System Classpath), (/opt/spark/jars/cos_api-bundle-5.6.19.jar,System Classpath), (/opt/spark/jars/cosmos-analytics-spark-3.3.1-connector-1.8.10.jar,System Classpath), (/opt/spark/jars/curator-client-2.13.0.jar,System Classpath), (/opt/spark/jars/curator-framework-2.13.0.jar,System Classpath), (/opt/spark/jars/curator-recipes-2.13.0.jar,System Classpath), (/opt/spark/jars/datanucleus-api-jdo-4.2.4.jar,System Classpath), (/opt/spark/jars/datanucleus-core-4.1.17.jar,System Classpath), (/opt/spark/jars/datanucleus-rdbms-4.1.19.jar,System Classpath), (/opt/spark/jars/delta-core_2.12-2.2.0.12.jar,System Classpath), (/opt/spark/jars/delta-iceberg_2.12-2.2.0.12.jar,System Classpath), (/opt/spark/jars/delta-storage-2.2.0.12.jar,System Classpath), (/opt/spark/jars/derby-10.14.2.0.jar,System Classpath), (/opt/spark/jars/dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar,System Classpath), (/opt/spark/jars/flatbuffers-java-1.12.0.jar,System Classpath), (/opt/spark/jars/fluent-logger-jar-with-dependencies-jdk8.jar,System Classpath), (/opt/spark/jars/genesis-client_2.12-0.27.0-jar-with-dependencies.jar,System Classpath), (/opt/spark/jars/gluten-velox-bundle-spark3.3_2.12-ubuntu_18.04-0.5.0-SNAPSHOT.jar,System Classpath), (/opt/spark/jars/gson-2.8.9.jar,System Classpath), (/opt/spark/jars/guava-14.0.1.jar,System Classpath), (/opt/spark/jars/hadoop-aliyun-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-annotations-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-aws-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-azure-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-azure-datalake-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-azureml-1.0-fs.jar,System Classpath), (/opt/spark/jars/hadoop-client-api-3.3.3.5.2.20240509.1.jar,System Classpath), 
(/opt/spark/jars/hadoop-client-runtime-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-cloud-storage-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-cos-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-openstack-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hadoop-shaded-guava-1.1.1.jar,System Classpath), (/opt/spark/jars/hadoop-yarn-server-web-proxy-3.3.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/hdinsight-spark-metrics-3.3.0-1.0.3.jar,System Classpath), (/opt/spark/jars/hive-beeline-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-cli-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-common-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-exec-2.3.9-core.jar,System Classpath), (/opt/spark/jars/hive-jdbc-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-llap-common-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-metastore-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-serde-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-service-rpc-3.1.2.jar,System Classpath), (/opt/spark/jars/hive-shims-0.23-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-shims-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-shims-common-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-shims-scheduler-2.3.9.jar,System Classpath), (/opt/spark/jars/hive-storage-api-2.7.2.jar,System Classpath), (/opt/spark/jars/hive-vector-code-gen-2.3.9.jar,System Classpath), (/opt/spark/jars/hk2-api-2.6.1.jar,System Classpath), (/opt/spark/jars/hk2-locator-2.6.1.jar,System Classpath), (/opt/spark/jars/hk2-utils-2.6.1.jar,System Classpath), (/opt/spark/jars/httpclient-4.5.13.jar,System Classpath), (/opt/spark/jars/httpclient5-5.1.3.jar,System Classpath), (/opt/spark/jars/httpcore-4.4.14.jar,System Classpath), (/opt/spark/jars/httpmime-4.5.13.jar,System Classpath), (/opt/spark/jars/impulse-core_spark3.3_2.12-1.0.19.jar,System Classpath), 
(/opt/spark/jars/impulse-telemetry-mds_spark3.3_2.12-1.0.19.jar,System Classpath), (/opt/spark/jars/ini4j-0.5.4.jar,System Classpath), (/opt/spark/jars/isolation-forest_3.3.3_2.12-3.0.4.jar,System Classpath), (/opt/spark/jars/istack-commons-runtime-3.0.8.jar,System Classpath), (/opt/spark/jars/ivy-2.5.1.jar,System Classpath), (/opt/spark/jars/jackson-annotations-2.13.4.jar,System Classpath), (/opt/spark/jars/jackson-core-2.13.4.jar,System Classpath), (/opt/spark/jars/jackson-core-asl-1.9.13.jar,System Classpath), (/opt/spark/jars/jackson-databind-2.13.4.2.jar,System Classpath), (/opt/spark/jars/jackson-dataformat-cbor-2.13.4.jar,System Classpath), (/opt/spark/jars/jackson-mapper-asl-1.9.13.jar,System Classpath), (/opt/spark/jars/jackson-module-scala_2.12-2.13.4.jar,System Classpath), (/opt/spark/jars/jakarta.annotation-api-1.3.5.jar,System Classpath), (/opt/spark/jars/jakarta.inject-2.6.1.jar,System Classpath), (/opt/spark/jars/jakarta.servlet-api-4.0.3.jar,System Classpath), (/opt/spark/jars/jakarta.validation-api-2.0.2.jar,System Classpath), (/opt/spark/jars/jakarta.ws.rs-api-2.1.6.jar,System Classpath), (/opt/spark/jars/jakarta.xml.bind-api-2.3.2.jar,System Classpath), (/opt/spark/jars/janino-3.0.16.jar,System Classpath), (/opt/spark/jars/javassist-3.25.0-GA.jar,System Classpath), (/opt/spark/jars/javatuples-1.2.jar,System Classpath), (/opt/spark/jars/javax.jdo-3.2.0-m3.jar,System Classpath), (/opt/spark/jars/javolution-5.5.1.jar,System Classpath), (/opt/spark/jars/jaxb-api-2.2.11.jar,System Classpath), (/opt/spark/jars/jaxb-runtime-2.3.2.jar,System Classpath), (/opt/spark/jars/jcl-over-slf4j-1.7.32.jar,System Classpath), (/opt/spark/jars/jdo-api-3.0.1.jar,System Classpath), (/opt/spark/jars/jdom2-2.0.6.jar,System Classpath), (/opt/spark/jars/jersey-client-2.36.jar,System Classpath), (/opt/spark/jars/jersey-common-2.36.jar,System Classpath), (/opt/spark/jars/jersey-container-servlet-2.36.jar,System Classpath), 
(/opt/spark/jars/jersey-container-servlet-core-2.36.jar,System Classpath), (/opt/spark/jars/jersey-hk2-2.36.jar,System Classpath), (/opt/spark/jars/jersey-server-2.36.jar,System Classpath), (/opt/spark/jars/jettison-1.1.jar,System Classpath), (/opt/spark/jars/jetty-util-9.4.53.v20231009.jar,System Classpath), (/opt/spark/jars/jetty-util-ajax-9.4.53.v20231009.jar,System Classpath), (/opt/spark/jars/jline-2.14.6.jar,System Classpath), (/opt/spark/jars/joda-time-2.10.13.jar,System Classpath), (/opt/spark/jars/jodd-core-3.5.2.jar,System Classpath), (/opt/spark/jars/jpam-1.1.jar,System Classpath), (/opt/spark/jars/jsch-0.1.54.jar,System Classpath), (/opt/spark/jars/json-1.8.jar,System Classpath), (/opt/spark/jars/json-20090211.jar,System Classpath), (/opt/spark/jars/json-20231013.jar,System Classpath), (/opt/spark/jars/json-simple-1.1.1.jar,System Classpath), (/opt/spark/jars/json-simple-1.1.jar,System Classpath), (/opt/spark/jars/json4s-ast_2.12-3.7.0-M11.jar,System Classpath), (/opt/spark/jars/json4s-core_2.12-3.7.0-M11.jar,System Classpath), (/opt/spark/jars/json4s-jackson_2.12-3.7.0-M11.jar,System Classpath), (/opt/spark/jars/json4s-scalap_2.12-3.7.0-M11.jar,System Classpath), (/opt/spark/jars/jsr305-3.0.0.jar,System Classpath), (/opt/spark/jars/jta-1.1.jar,System Classpath), (/opt/spark/jars/jul-to-slf4j-1.7.32.jar,System Classpath), (/opt/spark/jars/junit-jupiter-5.5.2.jar,System Classpath), (/opt/spark/jars/junit-jupiter-api-5.5.2.jar,System Classpath), (/opt/spark/jars/junit-jupiter-engine-5.5.2.jar,System Classpath), (/opt/spark/jars/junit-jupiter-params-5.5.2.jar,System Classpath), (/opt/spark/jars/junit-platform-commons-1.5.2.jar,System Classpath), (/opt/spark/jars/junit-platform-engine-1.5.2.jar,System Classpath), (/opt/spark/jars/kafka-clients-2.8.1.jar,System Classpath), (/opt/spark/jars/kryo-shaded-4.0.2.jar,System Classpath), (/opt/spark/jars/kusto-data-3.2.1-SynapseFabric.jar,System Classpath), 
(/opt/spark/jars/kusto-ingest-3.2.1-SynapseFabric.jar,System Classpath), (/opt/spark/jars/kusto-spark_3.0_2.12-3.1.16.jar,System Classpath), (/opt/spark/jars/lapack-2.2.1.jar,System Classpath), (/opt/spark/jars/leveldbjni-all-1.8.jar,System Classpath), (/opt/spark/jars/libfb303-0.9.3.jar,System Classpath), (/opt/spark/jars/libthrift-0.12.0.jar,System Classpath), (/opt/spark/jars/lightgbmlib-3.3.510.jar,System Classpath), (/opt/spark/jars/log4j-1.2-api-2.17.2.jar,System Classpath), (/opt/spark/jars/log4j-api-2.17.2.jar,System Classpath), (/opt/spark/jars/log4j-core-2.17.2.jar,System Classpath), (/opt/spark/jars/log4j-slf4j-impl-2.17.2.jar,System Classpath), (/opt/spark/jars/lz4-java-1.8.0.jar,System Classpath), (/opt/spark/jars/mdsdclientdynamic-2.0.jar,System Classpath), (/opt/spark/jars/metrics-core-4.2.7.jar,System Classpath), (/opt/spark/jars/metrics-graphite-4.2.7.jar,System Classpath), (/opt/spark/jars/metrics-jmx-4.2.7.jar,System Classpath), (/opt/spark/jars/metrics-json-4.2.7.jar,System Classpath), (/opt/spark/jars/metrics-jvm-4.2.7.jar,System Classpath), (/opt/spark/jars/microsoft-catalog-metastore-client-1.1.17.jar,System Classpath), (/opt/spark/jars/microsoft-log4j-etwappender-1.0.jar,System Classpath), (/opt/spark/jars/minlog-1.3.0.jar,System Classpath), (/opt/spark/jars/mlflow-spark-2.1.1.jar,System Classpath), (/opt/spark/jars/mssql-jdbc-12.4.2.jre8.jar,System Classpath), (/opt/spark/jars/mssql-jdbc-8.4.1.jre8.jar,System Classpath), (/opt/spark/jars/mysql-connector-java-8.0.18.jar,System Classpath), (/opt/spark/jars/netty-all-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-buffer-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-codec-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-common-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-handler-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-resolver-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-tcnative-classes-2.0.48.Final.jar,System 
Classpath), (/opt/spark/jars/netty-transport-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-transport-classes-epoll-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-transport-classes-kqueue-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/netty-transport-native-epoll-4.1.74.Final-linux-aarch_64.jar,System Classpath), (/opt/spark/jars/netty-transport-native-epoll-4.1.74.Final-linux-x86_64.jar,System Classpath), (/opt/spark/jars/netty-transport-native-kqueue-4.1.74.Final-osx-aarch_64.jar,System Classpath), (/opt/spark/jars/netty-transport-native-kqueue-4.1.74.Final-osx-x86_64.jar,System Classpath), (/opt/spark/jars/netty-transport-native-unix-common-4.1.74.Final.jar,System Classpath), (/opt/spark/jars/notebook-utils-3.3.0-20240213.4.jar,System Classpath), (/opt/spark/jars/objenesis-3.2.jar,System Classpath), (/opt/spark/jars/onnx-protobuf_2.12-0.9.3.jar,System Classpath), (/opt/spark/jars/onnxruntime_gpu-1.8.1.jar,System Classpath), (/opt/spark/jars/opencsv-2.3.jar,System Classpath), (/opt/spark/jars/opencv-3.2.0-1.jar,System Classpath), (/opt/spark/jars/opentest4j-1.2.0.jar,System Classpath), (/opt/spark/jars/opentracing-api-0.33.0.jar,System Classpath), (/opt/spark/jars/opentracing-noop-0.33.0.jar,System Classpath), (/opt/spark/jars/opentracing-util-0.33.0.jar,System Classpath), (/opt/spark/jars/orc-core-1.7.6.jar,System Classpath), (/opt/spark/jars/orc-mapreduce-1.7.6.jar,System Classpath), (/opt/spark/jars/orc-shims-1.7.6.jar,System Classpath), (/opt/spark/jars/oro-2.0.8.jar,System Classpath), (/opt/spark/jars/osgi-resource-locator-1.0.3.jar,System Classpath), (/opt/spark/jars/paranamer-2.8.jar,System Classpath), (/opt/spark/jars/parquet-column-1.12.3.jar,System Classpath), (/opt/spark/jars/parquet-common-1.12.3.jar,System Classpath), (/opt/spark/jars/parquet-encoding-1.12.3.jar,System Classpath), (/opt/spark/jars/parquet-format-structures-1.12.3.jar,System Classpath), (/opt/spark/jars/parquet-hadoop-1.12.3.jar,System Classpath), 
(/opt/spark/jars/parquet-jackson-1.12.3.jar,System Classpath), (/opt/spark/jars/peregrine-spark_3.3.0-0.10.3.jar,System Classpath), (/opt/spark/jars/pickle-1.2.jar,System Classpath), (/opt/spark/jars/postgresql-42.2.9.jar,System Classpath), (/opt/spark/jars/protobuf-java-2.5.0.jar,System Classpath), (/opt/spark/jars/proton-j-0.33.8.jar,System Classpath), (/opt/spark/jars/py4j-0.10.9.5.jar,System Classpath), (/opt/spark/jars/qpid-proton-j-extensions-1.2.4.jar,System Classpath), (/opt/spark/jars/resilience4j-core-1.7.1.jar,System Classpath), (/opt/spark/jars/resilience4j-retry-1.7.1.jar,System Classpath), (/opt/spark/jars/rocksdbjni-6.20.3.jar,System Classpath), (/opt/spark/jars/scala-collection-compat_2.12-2.1.1.jar,System Classpath), (/opt/spark/jars/scala-compiler-2.12.15.jar,System Classpath), (/opt/spark/jars/scala-java8-compat_2.12-0.9.0.jar,System Classpath), (/opt/spark/jars/scala-library-2.12.15.jar,System Classpath), (/opt/spark/jars/scala-parser-combinators_2.12-1.1.2.jar,System Classpath), (/opt/spark/jars/scala-reflect-2.12.15.jar,System Classpath), (/opt/spark/jars/scala-xml_2.12-1.2.0.jar,System Classpath), (/opt/spark/jars/scalactic_2.12-3.2.14.jar,System Classpath), (/opt/spark/jars/shapeless_2.12-2.3.7.jar,System Classpath), (/opt/spark/jars/shims-0.9.25.jar,System Classpath), (/opt/spark/jars/slf4j-api-1.7.32.jar,System Classpath), (/opt/spark/jars/snappy-java-1.1.10.5.jar,System Classpath), (/opt/spark/jars/spark-3.3-advisor-core_2.12-1.0.18.jar,System Classpath), (/opt/spark/jars/spark-3.3-rpc-history-server-app-listener_2.12-1.0.0.jar,System Classpath), (/opt/spark/jars/spark-3.3-rpc-history-server-core_2.12-1.0.0.jar,System Classpath), (/opt/spark/jars/spark-avro_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-catalyst_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-cdm-connector-assembly-spark3.3-1.19.7.jar,System Classpath), (/opt/spark/jars/spark-core_2.12-3.3.1.5.2.20240509.1.jar,System 
Classpath), (/opt/spark/jars/spark-enhancement_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-enhancementui_2.12-3.0.0.jar,System Classpath), (/opt/spark/jars/spark-graphx_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-hadoop-cloud_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-hive-thriftserver_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-hive_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-kusto-synapse-connector_3.1_2.12-1.3.4.jar,System Classpath), (/opt/spark/jars/spark-kvstore_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-launcher_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-lighter-contract_2.12-2.0.7_spark-3.3.0.jar,System Classpath), (/opt/spark/jars/spark-lighter-core_2.12-2.0.7_spark-3.3.0.jar,System Classpath), (/opt/spark/jars/spark-microsoft-tools_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-mllib-local_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-mllib_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-network-common_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-network-shuffle_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-repl_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-sketch_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-sql-kafka-0-10_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-sql_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-streaming-kafka-0-10-assembly_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-streaming-kafka-0-10_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-streaming_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-tags_2.12-3.3.1.5.2.20240509.1.jar,System 
Classpath), (/opt/spark/jars/spark-token-provider-kafka-0-10_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-unsafe_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark-yarn_2.12-3.3.1.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/spark_diagnostic_cli-2.1.0_spark-3.3.0.jar,System Classpath), (/opt/spark/jars/sparklyr-connector-1.0.0_spark-3.3.1.jar,System Classpath), (/opt/spark/jars/sparknativeparquetwriter_2.12-0.8.1-spark-3.3.jar,System Classpath), (/opt/spark/jars/spire-macros_2.12-0.17.0.jar,System Classpath), (/opt/spark/jars/spire-platform_2.12-0.17.0.jar,System Classpath), (/opt/spark/jars/spire-util_2.12-0.17.0.jar,System Classpath), (/opt/spark/jars/spire_2.12-0.17.0.jar,System Classpath), (/opt/spark/jars/spray-json_2.12-1.3.5.jar,System Classpath), (/opt/spark/jars/sqlanalyticsconnector-3.3.0-2.1.3.jar,System Classpath), (/opt/spark/jars/sqlanalyticsconnector-fabric-3.3.0-1.0.4.jar,System Classpath), (/opt/spark/jars/stax-api-1.0.1.jar,System Classpath), (/opt/spark/jars/stream-2.9.6.jar,System Classpath), (/opt/spark/jars/structuredstreamforspark_2.12-3.2.0-2.3.0.jar,System Classpath), (/opt/spark/jars/super-csv-2.2.0.jar,System Classpath), (/opt/spark/jars/synapseml-cognitive_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml-core_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml-deep-learning_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml-internal_2.12-1.0.4.0-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml-lightgbm_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml-opencv_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml-vw_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synapseml_2.12-1.0.4-spark3.3.jar,System Classpath), (/opt/spark/jars/synfs-3.3.0-20240213.4.jar,System Classpath), (/opt/spark/jars/threeten-extra-1.5.0.jar,System Classpath), 
(/opt/spark/jars/tink-1.6.1.jar,System Classpath), (/opt/spark/jars/transaction-api-1.1.jar,System Classpath), (/opt/spark/jars/trident-core-1.2.6.jar,System Classpath), (/opt/spark/jars/tridentsystemtokenlibrary-assembly-1.6.5.jar,System Classpath), (/opt/spark/jars/tridenttokenlibrary-assembly-1.6.5.jar,System Classpath), (/opt/spark/jars/univocity-parsers-2.9.1.jar,System Classpath), (/opt/spark/jars/vavr-0.10.4.jar,System Classpath), (/opt/spark/jars/vavr-match-0.10.4.jar,System Classpath), (/opt/spark/jars/velocity-1.5.jar,System Classpath), (/opt/spark/jars/vw-jni-9.3.0.jar,System Classpath), (/opt/spark/jars/wildfly-openssl-1.0.7.Final.jar,System Classpath), (/opt/spark/jars/xbean-asm9-shaded-4.20.jar,System Classpath), (/opt/spark/jars/xz-1.8.jar,System Classpath), (/opt/spark/jars/zookeeper-3.6.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/zookeeper-jute-3.6.3.5.2.20240509.1.jar,System Classpath), (/opt/spark/jars/zstd-jni-1.5.2-1.jar,System Classpath), (/usr/lib/dw-connector/synapse/*,System Classpath), (/usr/lib/library-manager/bin/libraries/scala/*,System Classpath), (spark://vm-58f13156:42075/files/tmpvk6bf_1j.zip,Added By User)), Hadoop Properties -> List((adl.feature.ownerandgroup.enableupn,false), (adl.http.timeout,-1), (azure.catalog.metastore.authentication.class,com.microsoft.catalog.metastore.sasClient.SynapseAuthFilter), (azure.catalog.metastore.authentication.synapseauthenticationfilter.proxyhost,https://rpwesteurope.svc.datafactory.azure.com:44433), (azure.catalog.metastore.endpoint,https://tokenservice2.westeurope.azuresynapse.net:443/api/v1/proxy/subscriptions/14e4c1c9-5437-4eb8-8dad-45696707c729/resourceGroups/ds-resources/providers/Microsoft.Synapse/workspaces/25d827f3-cf10-4e2a-b65d-40c316812ddd/hms), (catalog.metastore.context,synapse), (dfs.client.ignore.namenode.default.kms.uri,false), (dfs.ha.fencing.ssh.connect-timeout,30000), (file.blocksize,67108864), (file.bytes-per-checksum,512), (file.client-write-packet-size,65536), 
(file.replication,1), (file.stream-buffer-size,4096), (fs.AbstractFileSystem.abfs.impl,org.apache.hadoop.fs.azurebfs.Abfs), (fs.AbstractFileSystem.abfss.impl,org.apache.hadoop.fs.azurebfs.Abfss), (fs.AbstractFileSystem.adl.impl,org.apache.hadoop.fs.adl.Adl), (fs.AbstractFileSystem.azureml.impl,org.apache.hadoop.fs.azureml.Azureml), (fs.AbstractFileSystem.file.impl,org.apache.hadoop.fs.local.LocalFs), (fs.AbstractFileSystem.ftp.impl,org.apache.hadoop.fs.ftp.FtpFs), (fs.AbstractFileSystem.gs.impl,com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS), (fs.AbstractFileSystem.har.impl,org.apache.hadoop.fs.HarFs), (fs.AbstractFileSystem.hdfs.impl,org.apache.hadoop.fs.Hdfs), (fs.AbstractFileSystem.s3a.impl,org.apache.hadoop.fs.s3a.S3A), (fs.AbstractFileSystem.swebhdfs.impl,org.apache.hadoop.fs.SWebHdfs), (fs.AbstractFileSystem.synfs.impl,org.apache.hadoop.fs.synfs.Synfs), (fs.AbstractFileSystem.viewfs.impl,org.apache.hadoop.fs.viewfs.ViewFs), (fs.AbstractFileSystem.wasb.impl,org.apache.hadoop.fs.azure.Wasb), (fs.AbstractFileSystem.wasbs.impl,org.apache.hadoop.fs.azure.Wasbs), (fs.AbstractFileSystem.webhdfs.impl,org.apache.hadoop.fs.WebHdfs), (fs.abfs.impl,org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem), (fs.abfss.impl,org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem), (fs.adl.impl,org.apache.hadoop.fs.adl.AdlFileSystem), (fs.adl.oauth2.access.token.provider,com.microsoft.azure.synapse.tokenlibrary.AzureMLTokenBasedTokenProviderGen1), (fs.adl.oauth2.access.token.provider.type,Custom), (fs.automatic.close,true), (fs.azure.account.auth.type,Custom), (fs.azure.account.key.dsmlstoragexfsyt.blob.core.windows.net,GUuca/wqxyQYAM/dG6so9ZyQW0yybwv11nbXCFCUwqwNb74cOjTZ8eogG94l4GoZTVVONiQDFV/WToXq+E0T2A==), (fs.azure.account.oauth.provider.type,com.microsoft.azure.synapse.tokenlibrary.AzureMLTokenBasedTokenProviderGen2), (fs.azure.authorization,false), (fs.azure.authorization.caching.enable,true), 
(fs.azure.block.blob.with.compaction.dir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/events/291/eventLogs/), (fs.azure.buffer.dir,${hadoop.tmp.dir}/abfs), (fs.azure.client.correlationid,ffa29c45-bec0-4198-9c33-2915e199aa93), (fs.azure.enable.append.support,true), (fs.azure.enable.readahead,true), (fs.azure.io.read.tolerate.concurrent.append,true), (fs.azure.io.retry.max.backoff.interval,20000), (fs.azure.io.retry.max.retries,19), (fs.azure.io.retry.mode,fulljitter), (fs.azure.local.sas.key.mode,false), (fs.azure.modification.time.millis.enabled,false), (fs.azure.sas.9e2f8fd9-9d5f-4acd-99b5-3885490a4d31.hobostoragenue9ivxr1n.blob.core.windows.net,*********(redacted)), (fs.azure.sas.azureml-blobstore-25d827f3-cf10-4e2a-b65d-40c316812ddd.dsmlstoragexfsyt.blob.core.windows.net,*********(redacted)), (fs.azure.sas.expiry.period,*********(redacted)), (fs.azure.saskey.usecontainersaskeyforallaccess,*********(redacted)), (fs.azure.secure.mode,false), (fs.azure.trident.always.use.http,false), (fs.azure.user.agent.prefix,User-Agent: APN/1.0 Azure Synapse Analytics/Spark/), (fs.azureml.impl,org.apache.hadoop.fs.azureml.AzureMLFileSystem), (fs.azureml.impl.disable.cache,true), (fs.client.resolve.remote.symlinks,true), (fs.client.resolve.topology.enabled,false), (fs.defaultFS,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net), (fs.df.interval,60000), (fs.du.interval,600000), (fs.ftp.data.connection.mode,ACTIVE_LOCAL_DATA_CONNECTION_MODE), (fs.ftp.host,0.0.0.0), (fs.ftp.host.port,21), (fs.ftp.impl,org.apache.hadoop.fs.ftp.FTPFileSystem), (fs.ftp.timeout,0), (fs.ftp.transfer.mode,BLOCK_TRANSFER_MODE), (fs.getspaceused.jitterMillis,60000), (fs.har.impl.disable.cache,true), (fs.permissions.umask-mode,022), (fs.s3a.accesspoint.required,false), (fs.s3a.assumed.role.credentials.provider,org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider), (fs.s3a.assumed.role.session.duration,30m), 
(fs.s3a.attempts.maximum,20), (fs.s3a.aws.credentials.provider, org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider, org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider, org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider ), (fs.s3a.block.size,32M), (fs.s3a.buffer.dir,${hadoop.tmp.dir}/s3a), (fs.s3a.change.detection.mode,server), (fs.s3a.change.detection.source,etag), (fs.s3a.change.detection.version.required,true), (fs.s3a.committer.abort.pending.uploads,true), (fs.s3a.committer.magic.enabled,true), (fs.s3a.committer.name,file), (fs.s3a.committer.staging.conflict-mode,append), (fs.s3a.committer.staging.tmp.path,tmp/staging), (fs.s3a.committer.staging.unique-filenames,true), (fs.s3a.committer.threads,8), (fs.s3a.connection.establish.timeout,5000), (fs.s3a.connection.maximum,96), (fs.s3a.connection.request.timeout,0), (fs.s3a.connection.ssl.enabled,true), (fs.s3a.connection.timeout,200000), (fs.s3a.downgrade.syncable.exceptions,true), (fs.s3a.endpoint,s3.amazonaws.com), (fs.s3a.etag.checksum.enabled,false), (fs.s3a.executor.capacity,16), (fs.s3a.fast.upload.active.blocks,4), (fs.s3a.fast.upload.buffer,disk), (fs.s3a.impl,org.apache.hadoop.fs.s3a.S3AFileSystem), (fs.s3a.list.version,2), (fs.s3a.max.total.tasks,32), (fs.s3a.metadatastore.authoritative,false), (fs.s3a.metadatastore.fail.on.write.error,true), (fs.s3a.metadatastore.impl,org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore), (fs.s3a.metadatastore.metadata.ttl,15m), (fs.s3a.multiobjectdelete.enable,true), (fs.s3a.multipart.purge,false), (fs.s3a.multipart.purge.age,86400), (fs.s3a.multipart.size,64M), (fs.s3a.multipart.threshold,128M), (fs.s3a.paging.maximum,5000), (fs.s3a.path.style.access,false), (fs.s3a.readahead.range,64K), (fs.s3a.retry.interval,500ms), (fs.s3a.retry.limit,7), (fs.s3a.retry.throttle.interval,100ms), (fs.s3a.retry.throttle.limit,20), (fs.s3a.s3guard.cli.prune.age,86400000), 
(fs.s3a.s3guard.consistency.retry.interval,2s), (fs.s3a.s3guard.consistency.retry.limit,7), (fs.s3a.s3guard.ddb.background.sleep,25ms), (fs.s3a.s3guard.ddb.max.retries,9), (fs.s3a.s3guard.ddb.table.capacity.read,0), (fs.s3a.s3guard.ddb.table.capacity.write,0), (fs.s3a.s3guard.ddb.table.create,false), (fs.s3a.s3guard.ddb.table.sse.enabled,false), (fs.s3a.s3guard.ddb.throttle.retry.interval,100ms), (fs.s3a.select.enabled,true), (fs.s3a.select.errors.include.sql,false), (fs.s3a.select.input.compression,none), (fs.s3a.select.input.csv.comment.marker,#), (fs.s3a.select.input.csv.field.delimiter,,), (fs.s3a.select.input.csv.header,none), (fs.s3a.select.input.csv.quote.character,"), (fs.s3a.select.input.csv.quote.escape.character,\\), (fs.s3a.select.input.csv.record.delimiter,\n), (fs.s3a.select.output.csv.field.delimiter,,), (fs.s3a.select.output.csv.quote.character,"), (fs.s3a.select.output.csv.quote.escape.character,\\), (fs.s3a.select.output.csv.quote.fields,always), (fs.s3a.select.output.csv.record.delimiter,\n), (fs.s3a.socket.recv.buffer,8192), (fs.s3a.socket.send.buffer,8192), (fs.s3a.ssl.channel.mode,default_jsse), (fs.s3a.threads.keepalivetime,60), (fs.s3a.threads.max,64), (fs.swift.impl,org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem), (fs.synfs.impl,org.apache.hadoop.fs.synfs.SynapseFileSystem), (fs.trash.checkpoint.interval,0), (fs.trash.interval,0), (fs.viewfs.overload.scheme.target.abfs.impl,org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem), (fs.viewfs.overload.scheme.target.abfss.impl,org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem), (fs.viewfs.overload.scheme.target.file.impl,org.apache.hadoop.fs.LocalFileSystem), (fs.viewfs.overload.scheme.target.ftp.impl,org.apache.hadoop.fs.ftp.FTPFileSystem), (fs.viewfs.overload.scheme.target.gs.impl,com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS), (fs.viewfs.overload.scheme.target.hdfs.impl,org.apache.hadoop.hdfs.DistributedFileSystem), 
(fs.viewfs.overload.scheme.target.http.impl,org.apache.hadoop.fs.http.HttpFileSystem), (fs.viewfs.overload.scheme.target.https.impl,org.apache.hadoop.fs.http.HttpsFileSystem), (fs.viewfs.overload.scheme.target.o3fs.impl,org.apache.hadoop.fs.ozone.OzoneFileSystem), (fs.viewfs.overload.scheme.target.ofs.impl,org.apache.hadoop.fs.ozone.RootedOzoneFileSystem), (fs.viewfs.overload.scheme.target.oss.impl,org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem), (fs.viewfs.overload.scheme.target.s3a.impl,org.apache.hadoop.fs.s3a.S3AFileSystem), (fs.viewfs.overload.scheme.target.swebhdfs.impl,org.apache.hadoop.hdfs.web.SWebHdfsFileSystem), (fs.viewfs.overload.scheme.target.swift.impl,org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem), (fs.viewfs.overload.scheme.target.wasb.impl,org.apache.hadoop.fs.azure.NativeAzureFileSystem), (fs.viewfs.overload.scheme.target.webhdfs.impl,org.apache.hadoop.hdfs.web.WebHdfsFileSystem), (fs.viewfs.rename.strategy,SAME_MOUNTPOINT), (fs.wasb.impl,org.apache.hadoop.fs.azure.NativeAzureFileSystem), (fs.wasbs.impl,org.apache.hadoop.fs.azure.NativeAzureFileSystem$Secure), (ftp.blocksize,67108864), (ftp.bytes-per-checksum,512), (ftp.client-write-packet-size,65536), (ftp.replication,3), (ftp.stream-buffer-size,4096), (ha.failover-controller.active-standby-elector.zk.op.retries,3), (ha.failover-controller.cli-check.rpc-timeout.ms,20000), (ha.failover-controller.graceful-fence.connection.retries,1), (ha.failover-controller.graceful-fence.rpc-timeout.ms,5000), (ha.failover-controller.new-active.rpc-timeout.ms,60000), (ha.health-monitor.check-interval.ms,1000), (ha.health-monitor.connect-retry-interval.ms,1000), (ha.health-monitor.rpc-timeout.ms,45000), (ha.health-monitor.rpc.connect.max.retries,1), (ha.health-monitor.sleep-after-disconnect.ms,1000), (ha.zookeeper.acl,world:anyone:rwcda), (ha.zookeeper.parent-znode,/hadoop-ha), (ha.zookeeper.session-timeout.ms,10000), (hadoop.caller.context.enabled,false), (hadoop.caller.context.max.size,128), 
(hadoop.caller.context.signature.max.size,40), (hadoop.common.configuration.version,3.0.0), (hadoop.domainname.resolver.impl,org.apache.hadoop.net.DNSDomainNameResolver), (hadoop.http.authentication.kerberos.keytab,${user.home}/hadoop.keytab), (hadoop.http.authentication.kerberos.principal,HTTP/_HOST@LOCALHOST), (hadoop.http.authentication.signature.secret.file,*********(redacted)), (hadoop.http.authentication.simple.anonymous.allowed,true), (hadoop.http.authentication.token.validity,36000), (hadoop.http.authentication.type,simple), (hadoop.http.cross-origin.allowed-headers,X-Requested-With,Content-Type,Accept,Origin), (hadoop.http.cross-origin.allowed-methods,GET,POST,HEAD), (hadoop.http.cross-origin.allowed-origins,*), (hadoop.http.cross-origin.enabled,false), (hadoop.http.cross-origin.max-age,1800), (hadoop.http.filter.initializers,org.apache.hadoop.http.lib.StaticUserWebFilter), (hadoop.http.idle_timeout.ms,60000), (hadoop.http.logs.enabled,true), (hadoop.http.sni.host.check.enabled,false), (hadoop.http.staticuser.user,dr.who), (hadoop.jetty.logs.serve.aliases,true), (hadoop.kerberos.keytab.login.autorenewal.enabled,false), (hadoop.kerberos.kinit.command,kinit), (hadoop.kerberos.min.seconds.before.relogin,60), (hadoop.metrics.jvm.use-thread-mxbean,false), (hadoop.prometheus.endpoint.enabled,false), (hadoop.registry.jaas.context,Client), (hadoop.registry.secure,false), (hadoop.registry.system.acls,sasl:yarn@, sasl:mapred@, sasl:hdfs@), (hadoop.registry.zk.connection.timeout.ms,15000), (hadoop.registry.zk.quorum,localhost:2181), (hadoop.registry.zk.retry.ceiling.ms,60000), (hadoop.registry.zk.retry.interval.ms,1000), (hadoop.registry.zk.retry.times,5), (hadoop.registry.zk.root,/registry), (hadoop.registry.zk.session.timeout.ms,60000), (hadoop.rpc.protection,authentication), (hadoop.rpc.socket.factory.class.default,org.apache.hadoop.net.StandardSocketFactory), (hadoop.security.auth_to_local.mechanism,hadoop), (hadoop.security.authentication,simple), 
(hadoop.security.authorization,false), (hadoop.security.credential.clear-text-fallback,true), (hadoop.security.crypto.buffer.size,8192), (hadoop.security.crypto.cipher.suite,AES/CTR/NoPadding), (hadoop.security.crypto.codec.classes.aes.ctr.nopadding,org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, org.apache.hadoop.crypto.JceAesCtrCryptoCodec), (hadoop.security.dns.log-slow-lookups.enabled,false), (hadoop.security.dns.log-slow-lookups.threshold.ms,1000), (hadoop.security.group.mapping,org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback), (hadoop.security.group.mapping.ldap.connection.timeout.ms,60000), (hadoop.security.group.mapping.ldap.conversion.rule,none), (hadoop.security.group.mapping.ldap.directory.search.timeout,10000), (hadoop.security.group.mapping.ldap.num.attempts,3), (hadoop.security.group.mapping.ldap.num.attempts.before.failover,3), (hadoop.security.group.mapping.ldap.posix.attr.gid.name,gidNumber), (hadoop.security.group.mapping.ldap.posix.attr.uid.name,uidNumber), (hadoop.security.group.mapping.ldap.read.timeout.ms,60000), (hadoop.security.group.mapping.ldap.search.attr.group.name,cn), (hadoop.security.group.mapping.ldap.search.attr.member,member), (hadoop.security.group.mapping.ldap.search.filter.group,(objectClass=group)), (hadoop.security.group.mapping.ldap.search.filter.user,(&(objectClass=user)(sAMAccountName={0}))), (hadoop.security.group.mapping.ldap.search.group.hierarchy.levels,0), (hadoop.security.group.mapping.ldap.ssl,false), (hadoop.security.group.mapping.providers.combined,true), (hadoop.security.groups.cache.background.reload,false), (hadoop.security.groups.cache.background.reload.threads,3), (hadoop.security.groups.cache.secs,300), (hadoop.security.groups.cache.warn.after.ms,5000), (hadoop.security.groups.negative-cache.secs,30), (hadoop.security.groups.shell.command.timeout,0s), (hadoop.security.instrumentation.requires.admin,false), (hadoop.security.java.secure.random.algorithm,SHA1PRNG), 
(hadoop.security.key.default.bitlength,128), (hadoop.security.key.default.cipher,AES/CTR/NoPadding), (hadoop.security.kms.client.authentication.retry-count,1), (hadoop.security.kms.client.encrypted.key.cache.expiry,43200000), (hadoop.security.kms.client.encrypted.key.cache.low-watermark,0.3f), (hadoop.security.kms.client.encrypted.key.cache.num.refill.threads,2), (hadoop.security.kms.client.encrypted.key.cache.size,500), (hadoop.security.kms.client.failover.sleep.base.millis,100), (hadoop.security.kms.client.failover.sleep.max.millis,2000), (hadoop.security.kms.client.timeout,60), (hadoop.security.random.device.file.path,/dev/urandom), (hadoop.security.secure.random.impl,org.apache.hadoop.crypto.random.OpensslSecureRandom), (hadoop.security.sensitive-config-keys,*********(redacted)), (hadoop.security.token.service.use_ip,true), (hadoop.security.uid.cache.secs,14400), (hadoop.service.shutdown.timeout,30s), (hadoop.shell.missing.defaultFs.warning,false), (hadoop.shell.safely.delete.limit.num.files,100), (hadoop.ssl.client.conf,ssl-client.xml), (hadoop.ssl.enabled.protocols,TLSv1.2), (hadoop.ssl.hostname.verifier,DEFAULT), (hadoop.ssl.keystores.factory.class,org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory), (hadoop.ssl.require.client.cert,false), (hadoop.ssl.server.conf,ssl-server.xml), (hadoop.system.tags,YARN,HDFS,NAMENODE,DATANODE,REQUIRED,SECURITY,KERBEROS,PERFORMANCE,CLIENT ,SERVER,DEBUG,DEPRECATED,COMMON,OPTIONAL), (hadoop.tags.system,YARN,HDFS,NAMENODE,DATANODE,REQUIRED,SECURITY,KERBEROS,PERFORMANCE,CLIENT ,SERVER,DEBUG,DEPRECATED,COMMON,OPTIONAL), (hadoop.tmp.dir,/mnt/var/hadoop/tmp), (hadoop.user.group.static.mapping.overrides,dr.who=;), (hadoop.util.hash.type,murmur), (hadoop.workaround.non.threadsafe.getpwuid,true), (hadoop.zk.acl,world:anyone:rwcda), (hadoop.zk.address,vm-58f13156:2181,vm-b0b35166:2181,vm-14223739:2181), (hadoop.zk.num-retries,1000), (hadoop.zk.retry-interval-ms,1000), (hadoop.zk.timeout-ms,10000), 
(hive.exec.scratchdir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/tmp/hive), (hive.imetastoreclient.factory.class,com.microsoft.catalog.metastore.metastoreclient.HiveMetastoreClientFactory), (hive.metastore.warehouse.dir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/synapse/workspaces/25d827f3-cf10-4e2a-b65d-40c316812ddd/warehouse), (io.bytes.per.checksum,512), (io.compression.codec.bzip2.library,system-native), (io.erasurecode.codec.rs-legacy.rawcoders,rs-legacy_java), (io.erasurecode.codec.rs.rawcoders,rs_native,rs_java), (io.erasurecode.codec.xor.rawcoders,xor_native,xor_java), (io.file.buffer.size,65536), (io.map.index.interval,128), (io.map.index.skip,0), (io.mapfile.bloom.error.rate,0.005), (io.mapfile.bloom.size,1048576), (io.seqfile.compress.blocksize,1000000), (io.seqfile.local.dir,${hadoop.tmp.dir}/io/local), (io.serializations,org.apache.hadoop.io.serializer.WritableSerialization, org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization, org.apache.hadoop.io.serializer.avro.AvroReflectSerialization), (io.skip.checksum.errors,false), (ipc.[port_number].backoff.enable,false), (ipc.[port_number].callqueue.impl,java.util.concurrent.LinkedBlockingQueue), (ipc.[port_number].cost-provider.impl,org.apache.hadoop.ipc.DefaultCostProvider), (ipc.[port_number].decay-scheduler.backoff.responsetime.enable,false), (ipc.[port_number].decay-scheduler.backoff.responsetime.thresholds,10s,20s,30s,40s), (ipc.[port_number].decay-scheduler.decay-factor,0.5), (ipc.[port_number].decay-scheduler.metrics.top.user.count,10), (ipc.[port_number].decay-scheduler.period-ms,5000), (ipc.[port_number].decay-scheduler.thresholds,13,25,50), (ipc.[port_number].faircallqueue.multiplexer.weights,8,4,2,1), (ipc.[port_number].identity-provider.impl,org.apache.hadoop.ipc.UserIdentityProvider), (ipc.[port_number].scheduler.impl,org.apache.hadoop.ipc.DefaultRpcScheduler), 
(ipc.[port_number].scheduler.priority.levels,4), (ipc.[port_number].weighted-cost.handler,1), (ipc.[port_number].weighted-cost.lockexclusive,100), (ipc.[port_number].weighted-cost.lockfree,1), (ipc.[port_number].weighted-cost.lockshared,10), (ipc.[port_number].weighted-cost.response,1), (ipc.client.bind.wildcard.addr,false), (ipc.client.connect.max.retries,10), (ipc.client.connect.max.retries.on.timeouts,45), (ipc.client.connect.retry.interval,1000), (ipc.client.connect.timeout,20000), (ipc.client.connection.maxidletime,10000), (ipc.client.fallback-to-simple-auth-allowed,false), (ipc.client.idlethreshold,4000), (ipc.client.kill.max,10), (ipc.client.low-latency,false), (ipc.client.ping,true), (ipc.client.rpc-timeout.ms,0), (ipc.client.tcpnodelay,true), (ipc.maximum.data.length,134217728), (ipc.maximum.response.length,134217728), (ipc.ping.interval,60000), (ipc.server.listen.queue.size,256), (ipc.server.log.slow.rpc,false), (ipc.server.max.connections,0), (ipc.server.purge.interval,15), (ipc.server.reuseaddr,true), (javax.jdo.option.ConnectionDriverName,com.microsoft.sqlserver.jdbc.SQLServerDriver), (javax.jdo.option.ConnectionPassword,*********(redacted)), (javax.jdo.option.ConnectionURL,;), (javax.jdo.option.ConnectionUserName,), (map.sort.class,org.apache.hadoop.util.QuickSort), (mapreduce.am.max-attempts,2), (mapreduce.app-submission.cross-platform,false), (mapreduce.client.completion.pollinterval,5000), (mapreduce.client.libjars.wildcard,true), (mapreduce.client.output.filter,FAILED), (mapreduce.client.progressmonitor.pollinterval,1000), (mapreduce.client.submit.file.replication,10), (mapreduce.cluster.acls.enabled,false), (mapreduce.cluster.local.dir,${hadoop.tmp.dir}/mapred/local), (mapreduce.fileoutputcommitter.algorithm.version,2), (mapreduce.fileoutputcommitter.task.cleanup.enabled,false), (mapreduce.framework.name,yarn), (mapreduce.ifile.readahead,true), (mapreduce.ifile.readahead.bytes,4194304), (mapreduce.input.fileinputformat.list-status.num-threads,1), 
(mapreduce.input.fileinputformat.split.minsize,0), (mapreduce.input.lineinputformat.linespermap,1), (mapreduce.job.acl-modify-job, ), (mapreduce.job.acl-view-job, ), (mapreduce.job.cache.limit.max-resources,0), (mapreduce.job.cache.limit.max-resources-mb,0), (mapreduce.job.cache.limit.max-single-resource-mb,0), (mapreduce.job.classloader,false), (mapreduce.job.committer.setup.cleanup.needed,true), (mapreduce.job.complete.cancel.delegation.tokens,true), (mapreduce.job.counters.max,120), (mapreduce.job.dfs.storage.capacity.kill-limit-exceed,false), (mapreduce.job.emit-timeline-data,false), (mapreduce.job.encrypted-intermediate-data,false), (mapreduce.job.encrypted-intermediate-data-key-size-bits,128), (mapreduce.job.encrypted-intermediate-data.buffer.kb,128), (mapreduce.job.end-notification.max.attempts,5), (mapreduce.job.end-notification.max.retry.interval,5000), (mapreduce.job.end-notification.retry.attempts,0), (mapreduce.job.end-notification.retry.interval,1000), (mapreduce.job.finish-when-all-reducers-done,true), (mapreduce.job.hdfs-servers,${fs.defaultFS}), (mapreduce.job.heap.memory-mb.ratio,0.8), (mapreduce.job.local-fs.single-disk-limit.bytes,-1), (mapreduce.job.local-fs.single-disk-limit.check.interval-ms,5000), (mapreduce.job.local-fs.single-disk-limit.check.kill-limit-exceed,true), (mapreduce.job.map.output.collector.class,org.apache.hadoop.mapred.MapTask$MapOutputBuffer), (mapreduce.job.maps,2), (mapreduce.job.max.map,-1), (mapreduce.job.max.split.locations,15), (mapreduce.job.maxtaskfailures.per.tracker,3), (mapreduce.job.queuename,default), (mapreduce.job.reduce.shuffle.consumer.plugin.class,org.apache.hadoop.mapreduce.task.reduce.Shuffle), (mapreduce.job.reduce.slowstart.completedmaps,0.05), (mapreduce.job.reducer.preempt.delay.sec,0), (mapreduce.job.reducer.unconditional-preempt.delay.sec,300), (mapreduce.job.reduces,1), (mapreduce.job.running.map.limit,0), (mapreduce.job.running.reduce.limit,0), (mapreduce.job.sharedcache.mode,disabled), 
(mapreduce.job.speculative.minimum-allowed-tasks,10), (mapreduce.job.speculative.retry-after-no-speculate,1000), (mapreduce.job.speculative.retry-after-speculate,15000), (mapreduce.job.speculative.slowtaskthreshold,1.0), (mapreduce.job.speculative.speculative-cap-running-tasks,0.1), (mapreduce.job.speculative.speculative-cap-total-tasks,0.01), (mapreduce.job.split.metainfo.maxsize,10000000), (mapreduce.job.token.tracking.ids.enabled,false), (mapreduce.job.ubertask.enable,false), (mapreduce.job.ubertask.maxmaps,9), (mapreduce.job.ubertask.maxreduces,1), (mapreduce.jobhistory.address,0.0.0.0:10020), (mapreduce.jobhistory.admin.acl,*), (mapreduce.jobhistory.admin.address,0.0.0.0:10033), (mapreduce.jobhistory.always-scan-user-dir,false), (mapreduce.jobhistory.cleaner.enable,true), (mapreduce.jobhistory.cleaner.interval-ms,86400000), (mapreduce.jobhistory.client.thread-count,10), (mapreduce.jobhistory.datestring.cache.size,200000), (mapreduce.jobhistory.done-dir,${yarn.app.mapreduce.am.staging-dir}/history/done), (mapreduce.jobhistory.http.policy,HTTP_ONLY), (mapreduce.jobhistory.intermediate-done-dir,${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate), (mapreduce.jobhistory.intermediate-user-done-dir.permissions,770), (mapreduce.jobhistory.jhist.format,binary), (mapreduce.jobhistory.joblist.cache.size,20000), (mapreduce.jobhistory.jobname.limit,50), (mapreduce.jobhistory.keytab,/etc/security/keytab/jhs.service.keytab), (mapreduce.jobhistory.loadedjob.tasks.max,-1), (mapreduce.jobhistory.loadedjobs.cache.size,5), (mapreduce.jobhistory.max-age-ms,604800000), (mapreduce.jobhistory.minicluster.fixed.ports,false), (mapreduce.jobhistory.move.interval-ms,180000), (mapreduce.jobhistory.move.thread-count,3), (mapreduce.jobhistory.principal,jhs/_HOST@REALM.TLD), (mapreduce.jobhistory.recovery.enable,false), (mapreduce.jobhistory.recovery.store.class,org.apache.hadoop.mapreduce.v2.hs.HistoryServerFileSystemStateStoreService), 
(mapreduce.jobhistory.recovery.store.fs.uri,${hadoop.tmp.dir}/mapred/history/recoverystore), (mapreduce.jobhistory.recovery.store.leveldb.path,${hadoop.tmp.dir}/mapred/history/recoverystore), (mapreduce.jobhistory.webapp.address,0.0.0.0:19888), (mapreduce.jobhistory.webapp.https.address,0.0.0.0:19890), (mapreduce.jobhistory.webapp.rest-csrf.custom-header,X-XSRF-Header), (mapreduce.jobhistory.webapp.rest-csrf.enabled,false), (mapreduce.jobhistory.webapp.rest-csrf.methods-to-ignore,GET,OPTIONS,HEAD), (mapreduce.jobhistory.webapp.xfs-filter.xframe-options,SAMEORIGIN), (mapreduce.jvm.system-properties-to-log,os.name,os.version,java.home,java.runtime.version,java.vendor,java.version,java.vm.name,java.class.path,java.io.tmpdir,user.dir,user.name), (mapreduce.map.cpu.vcores,1), (mapreduce.map.env,HADOOP_MAPRED_HOME=${HADOOP_HOME}), (mapreduce.map.java.opts,-Xmx2560M -Xms2560M -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC), (mapreduce.map.log.level,INFO), (mapreduce.map.maxattempts,4), (mapreduce.map.memory.mb,3072), (mapreduce.map.output.compress,false), (mapreduce.map.output.compress.codec,org.apache.hadoop.io.compress.DefaultCodec), (mapreduce.map.skip.maxrecords,0), (mapreduce.map.skip.proc-count.auto-incr,true), (mapreduce.map.sort.spill.percent,0.80), (mapreduce.map.speculative,true), (mapreduce.output.fileoutputformat.compress,false), (mapreduce.output.fileoutputformat.compress.codec,org.apache.hadoop.io.compress.DefaultCodec), (mapreduce.output.fileoutputformat.compress.type,RECORD), (mapreduce.outputcommitter.factory.scheme.s3a,org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory), (mapreduce.reduce.cpu.vcores,1), (mapreduce.reduce.env,HADOOP_MAPRED_HOME=${HADOOP_HOME}), (mapreduce.reduce.input.buffer.percent,0.0), (mapreduce.reduce.java.opts,-Xmx2560M -Xms2560M -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC), (mapreduce.reduce.log.level,INFO), (mapreduce.reduce.markreset.buffer.percent,0.0), 
(mapreduce.reduce.maxattempts,4), (mapreduce.reduce.memory.mb,3072), (mapreduce.reduce.merge.inmem.threshold,1000), (mapreduce.reduce.shuffle.connect.timeout,180000), (mapreduce.reduce.shuffle.fetch.retry.enabled,${yarn.nodemanager.recovery.enabled}), (mapreduce.reduce.shuffle.fetch.retry.interval-ms,1000), (mapreduce.reduce.shuffle.fetch.retry.timeout-ms,30000), (mapreduce.reduce.shuffle.input.buffer.percent,0.70), (mapreduce.reduce.shuffle.memory.limit.percent,0.25), (mapreduce.reduce.shuffle.merge.percent,0.66), (mapreduce.reduce.shuffle.parallelcopies,30), (mapreduce.reduce.shuffle.read.timeout,180000), (mapreduce.reduce.shuffle.retry-delay.max.ms,60000), (mapreduce.reduce.skip.maxgroups,0), (mapreduce.reduce.skip.proc-count.auto-incr,true), (mapreduce.reduce.speculative,true), (mapreduce.shuffle.connection-keep-alive.enable,false), (mapreduce.shuffle.connection-keep-alive.timeout,5), (mapreduce.shuffle.listen.queue.size,128), (mapreduce.shuffle.max.connections,0), (mapreduce.shuffle.max.threads,0), (mapreduce.shuffle.pathcache.concurrency-level,16), (mapreduce.shuffle.pathcache.expire-after-access-minutes,5), (mapreduce.shuffle.pathcache.max-weight,10485760), (mapreduce.shuffle.port,13562), (mapreduce.shuffle.ssl.enabled,false), (mapreduce.shuffle.ssl.file.buffer.size,65536), (mapreduce.shuffle.transfer.buffer.size,131072), (mapreduce.shuffle.transferTo.allowed,true), (mapreduce.task.combine.progress.records,10000), (mapreduce.task.exit.timeout,60000), (mapreduce.task.exit.timeout.check-interval-ms,20000), (mapreduce.task.files.preserve.failedtasks,false), (mapreduce.task.io.sort.factor,10), (mapreduce.task.io.sort.mb,1228), (mapreduce.task.local-fs.write-limit.bytes,-1), (mapreduce.task.merge.progress.records,10000), (mapreduce.task.profile,false), (mapreduce.task.profile.map.params,${mapreduce.task.profile.params}), (mapreduce.task.profile.maps,0-2), (mapreduce.task.profile.params,-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s), 
(mapreduce.task.profile.reduce.params,${mapreduce.task.profile.params}), (mapreduce.task.profile.reduces,0-2), (mapreduce.task.skip.start.attempts,2), (mapreduce.task.stuck.timeout-ms,600000), (mapreduce.task.timeout,600000), (mapreduce.task.userlog.limit.kb,0), (net.topology.impl,org.apache.hadoop.net.NetworkTopology), (net.topology.node.switch.mapping.impl,org.apache.hadoop.net.ScriptBasedMapping), (net.topology.script.number.args,100), (nfs.exports.allowed.hosts,* rw), (rpc.metrics.quantile.enable,false), (rpc.metrics.timeunit,MILLISECONDS), (seq.io.sort.factor,100), (seq.io.sort.mb,100), (spark.cluster.type,aml), (spark.shuffle.service.resolver.class,org.apache.spark.network.yarn.ShuffleMovementAwareExternalShuffleBlockResolver), (synapse.vfs.acceptedThreadNames,Executor task launch), (synapse.vfs.debug.log.level,3), (synapse.vfs.disabled.extensions,.csv), (synapse.vfs.enabled,true), (synapse.vfs.enabled.extensions,.parquet), (tfile.fs.input.buffer.size,262144), (tfile.fs.output.buffer.size,262144), (tfile.io.chunk.size,1048576), (yarn.acl.enable,false), (yarn.acl.reservation-enable,false), (yarn.admin.acl,*), (yarn.am.liveness-monitor.expiry-interval-ms,600000), (yarn.app.application-master.rolling-logs.appender.enable,true), (yarn.app.application-master.rolling-logs.job-group-appender.enable,true), (yarn.app.application-master.rolling-logs.rolling-appender.enable,true), (yarn.app.attempt.diagnostics.limit.kc,64), (yarn.app.mapreduce.am.command-opts,-Xmx2560M -Xms2560M -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC), (yarn.app.mapreduce.am.container.log.backups,0), (yarn.app.mapreduce.am.container.log.limit.kb,0), (yarn.app.mapreduce.am.containerlauncher.threadpool-initial-size,10), (yarn.app.mapreduce.am.env,HADOOP_MAPRED_HOME=${HADOOP_HOME}), (yarn.app.mapreduce.am.hard-kill-timeout-ms,10000), (yarn.app.mapreduce.am.job.client.port-range,53000-55000), (yarn.app.mapreduce.am.job.committer.cancel-timeout,60000), 
(yarn.app.mapreduce.am.job.committer.commit-window,10000), (yarn.app.mapreduce.am.job.task.listener.thread-count,30), (yarn.app.mapreduce.am.log.level,INFO), (yarn.app.mapreduce.am.resource.cpu-vcores,1), (yarn.app.mapreduce.am.resource.mb,3072), (yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms,1000), (yarn.app.mapreduce.am.staging-dir,/tmp/hadoop-yarn/staging), (yarn.app.mapreduce.am.staging-dir.erasurecoding.enabled,false), (yarn.app.mapreduce.am.webapp.https.client.auth,false), (yarn.app.mapreduce.am.webapp.https.enabled,false), (yarn.app.mapreduce.client-am.ipc.max-retries,3), (yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts,3), (yarn.app.mapreduce.client.job.max-retries,3), (yarn.app.mapreduce.client.job.retry-interval,2000), (yarn.app.mapreduce.client.max-retries,3), (yarn.app.mapreduce.shuffle.log.backups,0), (yarn.app.mapreduce.shuffle.log.limit.kb,0), (yarn.app.mapreduce.shuffle.log.separate,true), (yarn.app.mapreduce.task.container.log.backups,0), (yarn.application.classpath,$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*), (yarn.client.application-client-protocol.poll-interval-ms,200), (yarn.client.application-client-protocol.poll-timeout-ms,-1), (yarn.client.failover-no-ha-proxy-provider,org.apache.hadoop.yarn.client.DefaultNoHARMFailoverProxyProvider), (yarn.client.failover-proxy-provider,org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider), (yarn.client.failover-retries,0), (yarn.client.failover-retries-on-socket-timeouts,0), (yarn.client.failover-sleep-base-ms,5000), (yarn.client.failover-sleep-max-ms,5000), (yarn.client.load.resource-types.from-server,false), (yarn.client.max-cached-nodemanagers-proxies,0), (yarn.client.nodemanager-client-async.thread-pool-max-size,500), (yarn.client.nodemanager-connect.max-wait-ms,60000), 
(yarn.client.nodemanager-connect.retry-interval-ms,2000), (yarn.cluster.max-application-priority,0), (yarn.dispatcher.cpu-monitor.samples-per-min,60), (yarn.dispatcher.drain-events.timeout,300000), (yarn.dispatcher.print-events-info.threshold,5000), (yarn.fail-fast,false), (yarn.federation.cache-ttl.secs,300), (yarn.federation.enabled,false), (yarn.federation.registry.base-dir,yarnfederation/), (yarn.federation.state-store.class,org.apache.hadoop.yarn.server.federation.store.impl.MemoryFederationStateStore), (yarn.federation.subcluster-resolver.class,org.apache.hadoop.yarn.server.federation.resolver.DefaultSubClusterResolverImpl), (yarn.http.policy,HTTP_ONLY), (yarn.intermediate-data-encryption.enable,false), (yarn.ipc.rpc.class,org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC), (yarn.is.minicluster,false), (yarn.log-aggregation-enable,true), (yarn.log-aggregation-status.time-out.ms,600000), (yarn.log-aggregation.IndexedFormat.remote-app-log-dir-suffix,logs-ifile), (yarn.log-aggregation.debug.filesize,104857600), (yarn.log-aggregation.file-controller.IndexedFormat.class,org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController), (yarn.log-aggregation.file-controller.TFile.class,org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.LogAggregationTFileController), (yarn.log-aggregation.file-formats,IndexedFormat,TFile), (yarn.log-aggregation.remote-app-log-dir.create-later,true), (yarn.log-aggregation.retain-check-interval-seconds,-1), (yarn.log-aggregation.retain-seconds,-1), (yarn.minicluster.control-resource-monitoring,false), (yarn.minicluster.fixed.ports,false), (yarn.minicluster.use-rpc,false), (yarn.minicluster.yarn.nodemanager.resource.memory-mb,4096), (yarn.nm.liveness-monitor.expiry-interval-ms,600000), (yarn.node-attribute.fs-store.impl.class,org.apache.hadoop.yarn.server.resourcemanager.nodelabels.FileSystemNodeAttributeStore), (yarn.node-labels.configuration-type,centralized), (yarn.node-labels.enabled,false), 
(yarn.node-labels.fs-store.impl.class,org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore), (yarn.nodemanager.address,${yarn.nodemanager.hostname}:0), (yarn.nodemanager.admin-env,MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX), (yarn.nodemanager.amrmproxy.address,0.0.0.0:8049), (yarn.nodemanager.amrmproxy.client.thread-count,25), (yarn.nodemanager.amrmproxy.enabled,false), (yarn.nodemanager.amrmproxy.ha.enable,false), (yarn.nodemanager.amrmproxy.interceptor-class.pipeline,org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor), (yarn.nodemanager.aux-services,spark_shuffle), (yarn.nodemanager.aux-services.manifest.enabled,false), (yarn.nodemanager.aux-services.manifest.reload-ms,0), (yarn.nodemanager.aux-services.mapreduce_shuffle.class,org.apache.hadoop.mapred.ShuffleHandler), (yarn.nodemanager.aux-services.spark_shuffle.class,org.apache.spark.network.yarn.YarnShuffleService), (yarn.nodemanager.aux-services.spark_shuffle.classpath,/usr/hdp/current/spark3-client/aux/*), (yarn.nodemanager.aux-services.timeline_collector.class,org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService), (yarn.nodemanager.collector-service.address,${yarn.nodemanager.hostname}:8048), (yarn.nodemanager.collector-service.thread-count,5), (yarn.nodemanager.container-diagnostics-maximum-size,10000), (yarn.nodemanager.container-executor.class,org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor), (yarn.nodemanager.container-executor.exit-code-file.timeout-ms,2000), (yarn.nodemanager.container-localizer.java.opts,-Xmx256m), (yarn.nodemanager.container-localizer.log.level,INFO), (yarn.nodemanager.container-log-monitor.dir-size-limit-bytes,1000000000), (yarn.nodemanager.container-log-monitor.enable,false), (yarn.nodemanager.container-log-monitor.interval-ms,60000), (yarn.nodemanager.container-log-monitor.total-size-limit-bytes,10000000000), (yarn.nodemanager.container-manager.thread-count,20), 
(yarn.nodemanager.container-metrics.enable,true), (yarn.nodemanager.container-metrics.period-ms,-1), (yarn.nodemanager.container-metrics.unregister-delay-ms,10000), (yarn.nodemanager.container-monitor.enabled,true), (yarn.nodemanager.container-monitor.interval-ms,2000), (yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled,false), (yarn.nodemanager.container-retry-minimum-interval-ms,1000), (yarn.nodemanager.container.stderr.pattern,{*stderr*,*STDERR*}), (yarn.nodemanager.container.stderr.tail.bytes,4096), (yarn.nodemanager.containers-launcher.class,org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher), (yarn.nodemanager.default-container-executor.log-dirs.permissions,710), (yarn.nodemanager.delete.debug-delay-sec,0), (yarn.nodemanager.delete.thread-count,4), (yarn.nodemanager.disk-health-checker.disk-free-space-threshold.enabled,true), (yarn.nodemanager.disk-health-checker.disk-utilization-threshold.enabled,true), (yarn.nodemanager.disk-health-checker.enable,true), (yarn.nodemanager.disk-health-checker.interval-ms,120000), (yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage,90.0), (yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb,1000), (yarn.nodemanager.disk-health-checker.min-free-space-per-disk-watermark-high-mb,0), (yarn.nodemanager.disk-health-checker.min-healthy-disks,0.25), (yarn.nodemanager.disk-validator,basic), (yarn.nodemanager.distributed-scheduling.enabled,false), (yarn.nodemanager.elastic-memory-control.enabled,false), (yarn.nodemanager.elastic-memory-control.oom-handler,org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.DefaultOOMHandler), (yarn.nodemanager.elastic-memory-control.timeout-sec,5), (yarn.nodemanager.emit-container-events,true), (yarn.nodemanager.env-whitelist,JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ), 
(yarn.nodemanager.health-checker.interval-ms,600000), (yarn.nodemanager.health-checker.run-before-startup,false), (yarn.nodemanager.health-checker.scripts,script), (yarn.nodemanager.health-checker.timeout-ms,1200000), (yarn.nodemanager.hostname,0.0.0.0), (yarn.nodemanager.keytab,/etc/krb5.keytab), (yarn.nodemanager.linux-container-executor.cgroups.delete-delay-ms,20), (yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms,1000), (yarn.nodemanager.linux-container-executor.cgroups.hierarchy,/hadoop-yarn), (yarn.nodemanager.linux-container-executor.cgroups.mount,false), (yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage,false), (yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users,true), (yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user,nobody), (yarn.nodemanager.linux-container-executor.nonsecure-mode.user-pattern,^[_.A-Za-z0-9][-@_.A-Za-z0-9]{0,255}?[$]?$), (yarn.nodemanager.linux-container-executor.resources-handler.class,org.apache.hadoop.yarn.server.nodemanager.util.DefaultLCEResourcesHandler), (yarn.nodemanager.local-cache.max-files-per-directory,8192), (yarn.nodemanager.local-dirs,${hadoop.tmp.dir}/nm-local-dir), (yarn.nodemanager.localizer.address,${yarn.nodemanager.hostname}:8040), (yarn.nodemanager.localizer.cache.cleanup.interval-ms,600000), (yarn.nodemanager.localizer.cache.target-size-mb,10240), (yarn.nodemanager.localizer.client.thread-count,5), (yarn.nodemanager.localizer.fetch.thread-count,4), (yarn.nodemanager.log-aggregation.compression-type,gz), (yarn.nodemanager.log-aggregation.debug-enabled,false), (yarn.nodemanager.log-aggregation.num-log-files-per-app,30), (yarn.nodemanager.log-aggregation.policy.class,org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AllContainerLogAggregationPolicy), (yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds,3600), (yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds.min,3600), 
(yarn.nodemanager.log-container-debug-info.enabled,true), (yarn.nodemanager.log-dirs,${yarn.log.dir}/userlogs), (yarn.nodemanager.log.deletion-threads-count,4), (yarn.nodemanager.log.retain-seconds,604800), (yarn.nodemanager.logaggregation.threadpool-size-max,100), (yarn.nodemanager.node-attributes.provider.fetch-interval-ms,600000), (yarn.nodemanager.node-attributes.provider.fetch-timeout-ms,1200000), (yarn.nodemanager.node-attributes.resync-interval-ms,120000), (yarn.nodemanager.node-labels.provider.fetch-interval-ms,600000), (yarn.nodemanager.node-labels.provider.fetch-timeout-ms,1200000), (yarn.nodemanager.node-labels.resync-interval-ms,120000), (yarn.nodemanager.numa-awareness.enabled,false), (yarn.nodemanager.numa-awareness.numactl.cmd,/usr/bin/numactl), (yarn.nodemanager.numa-awareness.read-topology,false), (yarn.nodemanager.opportunistic-containers-max-queue-length,0), (yarn.nodemanager.opportunistic-containers-use-pause-for-preemption,false), (yarn.nodemanager.pluggable-device-framework.enabled,false), (yarn.nodemanager.pmem-check-enabled,false), (yarn.nodemanager.process-kill-wait.ms,2000), (yarn.nodemanager.recovery.compaction-interval-secs,3600), (yarn.nodemanager.recovery.dir,${hadoop.tmp.dir}/yarn-nm-recovery), (yarn.nodemanager.recovery.enabled,false), (yarn.nodemanager.recovery.supervised,false), (yarn.nodemanager.remote-app-log-dir,wasbs://9e2f8fd9-9d5f-4acd-99b5-3885490a4d31@hobostoragenue9ivxr1n.blob.core.windows.net/app-logs), (yarn.nodemanager.remote-app-log-dir-include-older,true), (yarn.nodemanager.remote-app-log-dir-suffix,logs), (yarn.nodemanager.resource-monitor.interval-ms,3000), (yarn.nodemanager.resource-plugins.fpga.allowed-fpga-devices,auto), (yarn.nodemanager.resource-plugins.fpga.vendor-plugin.class,org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin), (yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices,auto), 
(yarn.nodemanager.resource-plugins.gpu.docker-plugin,nvidia-docker-v1), (yarn.nodemanager.resource-plugins.gpu.docker-plugin.nvidia-docker-v1.endpoint,http://localhost:3476/v1.0/docker/cli), (yarn.nodemanager.resource.count-logical-processors-as-cores,false), (yarn.nodemanager.resource.cpu-vcores,16), (yarn.nodemanager.resource.detect-hardware-capabilities,false), (yarn.nodemanager.resource.memory-mb,131072), (yarn.nodemanager.resource.memory.cgroups.soft-limit-percentage,90.0), (yarn.nodemanager.resource.memory.cgroups.swappiness,0), (yarn.nodemanager.resource.memory.enabled,false), (yarn.nodemanager.resource.memory.enforced,true), (yarn.nodemanager.resource.pcores-vcores-multiplier,1.0), (yarn.nodemanager.resource.percentage-physical-cpu-limit,80), (yarn.nodemanager.resource.system-reserved-memory-mb,-1), (yarn.nodemanager.resourcemanager.minimum.version,NONE), (yarn.nodemanager.runtime.linux.allowed-runtimes,default), (yarn.nodemanager.runtime.linux.docker.allowed-container-networks,host,none,bridge), (yarn.nodemanager.runtime.linux.docker.allowed-container-runtimes,runc), (yarn.nodemanager.runtime.linux.docker.capabilities,CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE), (yarn.nodemanager.runtime.linux.docker.default-container-network,host), (yarn.nodemanager.runtime.linux.docker.delayed-removal.allowed,false), (yarn.nodemanager.runtime.linux.docker.enable-userremapping.allowed,true), (yarn.nodemanager.runtime.linux.docker.host-pid-namespace.allowed,false), (yarn.nodemanager.runtime.linux.docker.image-update,false), (yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed,false), (yarn.nodemanager.runtime.linux.docker.stop.grace-period,10), (yarn.nodemanager.runtime.linux.docker.userremapping-gid-threshold,1), (yarn.nodemanager.runtime.linux.docker.userremapping-uid-threshold,1), (yarn.nodemanager.runtime.linux.runc.allowed-container-networks,host,none,bridge), 
(yarn.nodemanager.runtime.linux.runc.allowed-container-runtimes,runc), (yarn.nodemanager.runtime.linux.runc.hdfs-manifest-to-resources-plugin.stat-cache-size,500), (yarn.nodemanager.runtime.linux.runc.hdfs-manifest-to-resources-plugin.stat-cache-timeout-interval-secs,360), (yarn.nodemanager.runtime.linux.runc.host-pid-namespace.allowed,false), (yarn.nodemanager.runtime.linux.runc.image-tag-to-manifest-plugin,org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.runc.ImageTagToManifestPlugin), (yarn.nodemanager.runtime.linux.runc.image-tag-to-manifest-plugin.cache-refresh-interval-secs,60), (yarn.nodemanager.runtime.linux.runc.image-tag-to-manifest-plugin.hdfs-hash-file,/runc-root/image-tag-to-hash), (yarn.nodemanager.runtime.linux.runc.image-tag-to-manifest-plugin.num-manifests-to-cache,10), (yarn.nodemanager.runtime.linux.runc.image-toplevel-dir,/runc-root), (yarn.nodemanager.runtime.linux.runc.layer-mounts-interval-secs,600), (yarn.nodemanager.runtime.linux.runc.layer-mounts-to-keep,100), (yarn.nodemanager.runtime.linux.runc.manifest-to-resources-plugin,org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.runc.HdfsManifestToResourcesPlugin), (yarn.nodemanager.runtime.linux.runc.privileged-containers.allowed,false), (yarn.nodemanager.runtime.linux.sandbox-mode,disabled), (yarn.nodemanager.runtime.linux.sandbox-mode.local-dirs.permissions,read), (yarn.nodemanager.sleep-delay-before-sigkill.ms,250), (yarn.nodemanager.user-home-dir,/home/trusted-service-user), (yarn.nodemanager.vmem-check-enabled,false), (yarn.nodemanager.vmem-pmem-ratio,3.1), (yarn.nodemanager.webapp.address,${yarn.nodemanager.hostname}:8042), (yarn.nodemanager.webapp.cross-origin.enabled,false), (yarn.nodemanager.webapp.https.address,0.0.0.0:8044), (yarn.nodemanager.webapp.rest-csrf.custom-header,X-XSRF-Header), (yarn.nodemanager.webapp.rest-csrf.enabled,false), (yarn.nodemanager.webapp.rest-csrf.methods-to-ignore,GET,OPTIONS,HEAD), 
(yarn.nodemanager.webapp.xfs-filter.xframe-options,SAMEORIGIN), (yarn.nodemanager.windows-container.cpu-limit.enabled,false), (yarn.nodemanager.windows-container.memory-limit.enabled,false), (yarn.registry.class,org.apache.hadoop.registry.client.impl.FSRegistryOperationsService), (yarn.resourcemanager.activities-manager.app-activities.max-queue-length,100), (yarn.resourcemanager.activities-manager.app-activities.ttl-ms,600000), (yarn.resourcemanager.activities-manager.cleanup-interval-ms,5000), (yarn.resourcemanager.activities-manager.scheduler-activities.ttl-ms,600000), (yarn.resourcemanager.address,${yarn.resourcemanager.hostname}:8032), (yarn.resourcemanager.admin.address,${yarn.resourcemanager.hostname}:8033), (yarn.resourcemanager.admin.client.thread-count,1), (yarn.resourcemanager.am-rm-tokens.master-key-rolling-interval-secs,86400), (yarn.resourcemanager.am.max-attempts,2), (yarn.resourcemanager.amlauncher.thread-count,50), (yarn.resourcemanager.application-https.policy,NONE), (yarn.resourcemanager.application-tag-based-placement.enable,false), (yarn.resourcemanager.application-timeouts.monitor.interval-ms,3000), (yarn.resourcemanager.application.max-tag.length,100), (yarn.resourcemanager.application.max-tags,10), (yarn.resourcemanager.auto-update.containers,false), (yarn.resourcemanager.client.thread-count,50), (yarn.resourcemanager.cluster-id,13895c74-bfb6-434f-97d3-556cec4e06ef), (yarn.resourcemanager.configuration.file-system-based-store,/yarn/conf), (yarn.resourcemanager.configuration.provider-class,org.apache.hadoop.yarn.LocalConfigurationProvider), (yarn.resourcemanager.connect.max-wait.ms,120000), (yarn.resourcemanager.connect.retry-interval.ms,2000), (yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs,86400), (yarn.resourcemanager.container.liveness-monitor.interval-ms,600000), (yarn.resourcemanager.decommissioning-nodes-watcher.delay-ms,0), (yarn.resourcemanager.decommissioning-nodes-watcher.poll-interval-secs,20), 
(yarn.resourcemanager.delayed.delegation-token.removal-interval-ms,30000), (yarn.resourcemanager.delegation-token-renewer.thread-count,50), (yarn.resourcemanager.delegation-token-renewer.thread-retry-interval,60s), (yarn.resourcemanager.delegation-token-renewer.thread-retry-max-attempts,10), (yarn.resourcemanager.delegation-token-renewer.thread-timeout,60s), (yarn.resourcemanager.delegation-token.always-cancel,false), (yarn.resourcemanager.delegation-token.max-conf-size-bytes,12800), (yarn.resourcemanager.delegation.key.update-interval,86400000), (yarn.resourcemanager.delegation.token.max-lifetime,604800000), (yarn.resourcemanager.delegation.token.renew-interval,86400000), (yarn.resourcemanager.enable-node-untracked-without-include-path,true), (yarn.resourcemanager.epoch.range,0), (yarn.resourcemanager.fail-fast,${yarn.fail-fast}), (yarn.resourcemanager.fs.state-store.num-retries,0), (yarn.resourcemanager.fs.state-store.retry-interval-ms,1000), (yarn.resourcemanager.fs.state-store.uri,${hadoop.tmp.dir}/yarn/system/rmstore), (yarn.resourcemanager.ha.automatic-failover.embedded,true), (yarn.resourcemanager.ha.automatic-failover.enabled,true), (yarn.resourcemanager.ha.automatic-failover.zk-base-path,/yarn-leader-election), (yarn.resourcemanager.ha.enabled,true), (yarn.resourcemanager.ha.rm-ids,rm1,rm2), (yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size,10), (yarn.resourcemanager.hostname,0.0.0.0), (yarn.resourcemanager.hostname.rm1,vm-58f13156), (yarn.resourcemanager.hostname.rm2,vm-b0b35166), (yarn.resourcemanager.keytab,/etc/krb5.keytab), (yarn.resourcemanager.leveldb-state-store.compaction-interval-secs,3600), (yarn.resourcemanager.leveldb-state-store.path,${hadoop.tmp.dir}/yarn/system/rmstore), (yarn.resourcemanager.max-completed-applications,1000), (yarn.resourcemanager.max-log-aggregation-diagnostics-in-memory,10), (yarn.resourcemanager.metrics.runtime.buckets,60,300,1440), 
(yarn.resourcemanager.nm-container-queuing.load-comparator,QUEUE_LENGTH), (yarn.resourcemanager.nm-container-queuing.max-queue-length,15), (yarn.resourcemanager.nm-container-queuing.max-queue-wait-time-ms,100), (yarn.resourcemanager.nm-container-queuing.min-queue-length,5), (yarn.resourcemanager.nm-container-queuing.min-queue-wait-time-ms,10), (yarn.resourcemanager.nm-container-queuing.queue-limit-stdev,1.0f), (yarn.resourcemanager.nm-container-queuing.sorting-nodes-interval-ms,1000), (yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs,86400), (yarn.resourcemanager.node-ip-cache.expiry-interval-secs,-1), (yarn.resourcemanager.node-labels.provider.fetch-interval-ms,1800000), (yarn.resourcemanager.node-removal-untracked.timeout-ms,600000), (yarn.resourcemanager.nodemanager-connect-retries,10), (yarn.resourcemanager.nodemanager-graceful-decommission-timeout-secs,3600), (yarn.resourcemanager.nodemanager.minimum.version,NONE), (yarn.resourcemanager.nodemanagers.heartbeat-interval-max-ms,1000), (yarn.resourcemanager.nodemanagers.heartbeat-interval-min-ms,1000), (yarn.resourcemanager.nodemanagers.heartbeat-interval-ms,1000), (yarn.resourcemanager.nodemanagers.heartbeat-interval-scaling-enable,false), (yarn.resourcemanager.nodemanagers.heartbeat-interval-slowdown-factor,1.0), (yarn.resourcemanager.nodemanagers.heartbeat-interval-speedup-factor,1.0), (yarn.resourcemanager.nodes.exclude-path,/etc/hadoop/conf/yarn.exclude), (yarn.resourcemanager.opportunistic-container-allocation.enabled,false), (yarn.resourcemanager.opportunistic-container-allocation.nodes-used,10), (yarn.resourcemanager.opportunistic.max.container-allocation.per.am.heartbeat,-1), (yarn.resourcemanager.placement-constraints.algorithm.class,org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.algorithm.DefaultPlacementAlgorithm), (yarn.resourcemanager.placement-constraints.algorithm.iterator,SERIAL), (yarn.resourcemanager.placement-constraints.algorithm.pool-size,1), 
(yarn.resourcemanager.placement-constraints.handler,scheduler), (yarn.resourcemanager.placement-constraints.retry-attempts,3), (yarn.resourcemanager.placement-constraints.scheduler.pool-size,1), (yarn.resourcemanager.proxy-user-privileges.enabled,false), (yarn.resourcemanager.proxy.connection.timeout,60000), (yarn.resourcemanager.proxy.timeout.enabled,true), (yarn.resourcemanager.recovery.enabled,true), (yarn.resourcemanager.reservation-system.enable,false), (yarn.resourcemanager.reservation-system.planfollower.time-step,1000), (yarn.resourcemanager.resource-profiles.enabled,false), (yarn.resourcemanager.resource-profiles.source-file,resource-profiles.json), (yarn.resourcemanager.resource-tracker.address,${yarn.resourcemanager.hostname}:8031), (yarn.resourcemanager.resource-tracker.client.thread-count,50), (yarn.resourcemanager.resource-tracker.nm.ip-hostname-check,false), (yarn.resourcemanager.rm.container-allocation.expiry-interval-ms,600000), (yarn.resourcemanager.scheduler.address,${yarn.resourcemanager.hostname}:8030), (yarn.resourcemanager.scheduler.class,org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler), (yarn.resourcemanager.scheduler.client.thread-count,50), (yarn.resourcemanager.scheduler.monitor.enable,false), (yarn.resourcemanager.scheduler.monitor.policies,org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy), (yarn.resourcemanager.state-store.max-completed-applications,${yarn.resourcemanager.max-completed-applications}), (yarn.resourcemanager.store.class,org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore), (yarn.resourcemanager.submission-preprocessor.enabled,false), (yarn.resourcemanager.submission-preprocessor.file-refresh-interval-ms,60000), (yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size,10), (yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.batch-size,1000), 
(yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.enable-batch,false), (yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.interval-seconds,60), (yarn.resourcemanager.webapp.address,${yarn.resourcemanager.hostname}:8088), (yarn.resourcemanager.webapp.address.rm1,vm-58f13156:8088), (yarn.resourcemanager.webapp.address.rm2,vm-b0b35166:8088), (yarn.resourcemanager.webapp.cross-origin.enabled,false), (yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled,true), (yarn.resourcemanager.webapp.https.address,${yarn.resourcemanager.hostname}:8090), (yarn.resourcemanager.webapp.rest-csrf.custom-header,X-XSRF-Header), (yarn.resourcemanager.webapp.rest-csrf.enabled,false), (yarn.resourcemanager.webapp.rest-csrf.methods-to-ignore,GET,OPTIONS,HEAD), (yarn.resourcemanager.webapp.ui-actions.enabled,true), (yarn.resourcemanager.webapp.xfs-filter.xframe-options,SAMEORIGIN), (yarn.resourcemanager.work-preserving-recovery.enabled,true), (yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms,10000), (yarn.resourcemanager.zk-appid-node.split-index,0), (yarn.resourcemanager.zk-delegation-token-node.split-index,0), (yarn.resourcemanager.zk-max-znode-size.bytes,1048576), (yarn.resourcemanager.zk-state-store.parent-path,/rmstore), (yarn.rm.system-metrics-publisher.emit-container-events,false), (yarn.router.clientrm.interceptor-class.pipeline,org.apache.hadoop.yarn.server.router.clientrm.DefaultClientRequestInterceptor), (yarn.router.interceptor.user.threadpool-size,5), (yarn.router.pipeline.cache-max-size,25), (yarn.router.rmadmin.interceptor-class.pipeline,org.apache.hadoop.yarn.server.router.rmadmin.DefaultRMAdminRequestInterceptor), (yarn.router.webapp.address,0.0.0.0:8089), (yarn.router.webapp.https.address,0.0.0.0:8091), (yarn.router.webapp.interceptor-class.pipeline,org.apache.hadoop.yarn.server.router.webapp.DefaultRequestInterceptorREST), (yarn.scheduler.configuration.fs.path,file://${hadoop.tmp.dir}/yarn/system/schedconf), 
(yarn.scheduler.configuration.leveldb-store.compaction-interval-secs,86400), (yarn.scheduler.configuration.leveldb-store.path,${hadoop.tmp.dir}/yarn/system/confstore), (yarn.scheduler.configuration.max.version,100), (yarn.scheduler.configuration.mutation.acl-policy.class,org.apache.hadoop.yarn.server.resourcemanager.scheduler.DefaultConfigurationMutationACLPolicy), (yarn.scheduler.configuration.store.class,file), (yarn.scheduler.configuration.store.max-logs,1000), (yarn.scheduler.configuration.zk-store.parent-path,/confstore), (yarn.scheduler.include-port-in-node-name,false), (yarn.scheduler.maximum-allocation-mb,131072), (yarn.scheduler.maximum-allocation-vcores,16), (yarn.scheduler.minimum-allocation-mb,1024), (yarn.scheduler.minimum-allocation-vcores,1), (yarn.scheduler.queue-placement-rules,user-group), (yarn.sharedcache.admin.address,0.0.0.0:8047), (yarn.sharedcache.admin.thread-count,1), (yarn.sharedcache.app-checker.class,org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker), (yarn.sharedcache.checksum.algo.impl,org.apache.hadoop.yarn.sharedcache.ChecksumSHA256Impl), (yarn.sharedcache.cleaner.initial-delay-mins,10), (yarn.sharedcache.cleaner.period-mins,1440), (yarn.sharedcache.cleaner.resource-sleep-ms,0), (yarn.sharedcache.client-server.address,0.0.0.0:8045), (yarn.sharedcache.client-server.thread-count,50), (yarn.sharedcache.enabled,false), (yarn.sharedcache.nested-level,3), (yarn.sharedcache.nm.uploader.replication.factor,10), (yarn.sharedcache.nm.uploader.thread-count,20), (yarn.sharedcache.root-dir,/sharedcache), (yarn.sharedcache.store.class,org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore), (yarn.sharedcache.store.in-memory.check-period-mins,720), (yarn.sharedcache.store.in-memory.initial-delay-mins,10), (yarn.sharedcache.store.in-memory.staleness-period-mins,10080), (yarn.sharedcache.uploader.server.address,0.0.0.0:8046), (yarn.sharedcache.uploader.server.thread-count,50), 
(yarn.sharedcache.webapp.address,0.0.0.0:8788), (yarn.system-metrics-publisher.enabled,false), (yarn.timeline-service.address,${yarn.timeline-service.hostname}:10200), (yarn.timeline-service.app-aggregation-interval-secs,15), (yarn.timeline-service.app-collector.linger-period.ms,60000), (yarn.timeline-service.client.best-effort,false), (yarn.timeline-service.client.drain-entities.timeout.ms,2000), (yarn.timeline-service.client.fd-clean-interval-secs,60), (yarn.timeline-service.client.fd-flush-interval-secs,10), (yarn.timeline-service.client.fd-retain-secs,300), (yarn.timeline-service.client.internal-timers-ttl-secs,420), (yarn.timeline-service.client.max-retries,30), (yarn.timeline-service.client.retry-interval-ms,1000), (yarn.timeline-service.enabled,false), (yarn.timeline-service.entity-group-fs-store.active-dir,/tmp/entity-file-history/active), (yarn.timeline-service.entity-group-fs-store.app-cache-size,10), (yarn.timeline-service.entity-group-fs-store.cache-store-class,org.apache.hadoop.yarn.server.timeline.MemoryTimelineStore), (yarn.timeline-service.entity-group-fs-store.cleaner-interval-seconds,3600), (yarn.timeline-service.entity-group-fs-store.done-dir,/tmp/entity-file-history/done/), (yarn.timeline-service.entity-group-fs-store.leveldb-cache-read-cache-size,10485760), (yarn.timeline-service.entity-group-fs-store.retain-seconds,604800), (yarn.timeline-service.entity-group-fs-store.scan-interval-seconds,60), (yarn.timeline-service.entity-group-fs-store.summary-store,org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore), (yarn.timeline-service.entity-group-fs-store.with-user-dir,false), (yarn.timeline-service.flowname.max-size,0), (yarn.timeline-service.generic-application-history.max-applications,10000), (yarn.timeline-service.handler-thread-count,10), (yarn.timeline-service.hbase-schema.prefix,prod.), (yarn.timeline-service.hbase.coprocessor.app-final-value-retention-milliseconds,259200000), 
(yarn.timeline-service.hbase.coprocessor.jar.hdfs.location,/hbase/coprocessor/hadoop-yarn-server-timelineservice.jar), (yarn.timeline-service.hostname,0.0.0.0), (yarn.timeline-service.http-authentication.simple.anonymous.allowed,true), (yarn.timeline-service.http-authentication.type,simple), (yarn.timeline-service.http-cross-origin.enabled,false), (yarn.timeline-service.keytab,/etc/krb5.keytab), (yarn.timeline-service.leveldb-state-store.path,${hadoop.tmp.dir}/yarn/timeline), (yarn.timeline-service.leveldb-timeline-store.path,${hadoop.tmp.dir}/yarn/timeline), (yarn.timeline-service.leveldb-timeline-store.read-cache-size,104857600), (yarn.timeline-service.leveldb-timeline-store.start-time-read-cache-size,10000), (yarn.timeline-service.leveldb-timeline-store.start-time-write-cache-size,10000), (yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms,300000), (yarn.timeline-service.reader.class,org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl), (yarn.timeline-service.reader.webapp.address,${yarn.timeline-service.webapp.address}), (yarn.timeline-service.reader.webapp.https.address,${yarn.timeline-service.webapp.https.address}), (yarn.timeline-service.recovery.enabled,false), (yarn.timeline-service.state-store-class,org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore), (yarn.timeline-service.store-class,org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore), (yarn.timeline-service.timeline-client.number-of-async-entities-to-merge,10), (yarn.timeline-service.ttl-enable,true), (yarn.timeline-service.ttl-ms,604800000), (yarn.timeline-service.version,1.0f), (yarn.timeline-service.webapp.address,${yarn.timeline-service.hostname}:8188), (yarn.timeline-service.webapp.https.address,${yarn.timeline-service.hostname}:8190), (yarn.timeline-service.webapp.rest-csrf.custom-header,X-XSRF-Header), (yarn.timeline-service.webapp.rest-csrf.enabled,false), 
(yarn.timeline-service.webapp.rest-csrf.methods-to-ignore,GET,OPTIONS,HEAD), (yarn.timeline-service.webapp.xfs-filter.xframe-options,SAMEORIGIN), (yarn.timeline-service.writer.async.queue.capacity,100), (yarn.timeline-service.writer.class,org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl), (yarn.timeline-service.writer.flush-interval-seconds,60), (yarn.webapp.api-service.enable,false), (yarn.webapp.enable-rest-app-submissions,true), (yarn.webapp.filter-entity-list-by-user,false), (yarn.webapp.filter-invalid-xml-chars,false), (yarn.webapp.ui2.enable,false), (yarn.webapp.xfs-filter.enabled,true), (yarn.workflow-id.tag-prefix,workflowid:)), System Properties -> Vector((awt.toolkit,sun.awt.X11.XToolkit), (etwlogger.component,sparkdriver), (file.encoding,UTF-8), (file.encoding.pkg,sun.io), (file.separator,/), (java.awt.graphicsenv,sun.awt.X11GraphicsEnvironment), (java.awt.printerjob,sun.print.PSPrinterJob), (java.class.version,52.0), (java.endorsed.dirs,/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/endorsed), (java.ext.dirs,/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/ext:/usr/java/packages/lib/ext), (java.home,/usr/lib/jvm/java-8-openjdk-amd64/jre), (java.io.tmpdir,/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/tmp), (java.library.path,/usr/hdp/current/hadoop-client/lib/native::/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib), (java.runtime.name,OpenJDK Runtime Environment), (java.runtime.version,1.8.0_402-8u402-ga-2ubuntu1~18.04-b06), (java.specification.maintenance.version,5), (java.specification.name,Java Platform API Specification), (java.specification.vendor,Oracle Corporation), (java.specification.version,1.8), (java.vendor,Private Build), (java.vendor.url,http://java.oracle.com/), (java.vendor.url.bug,http://bugreport.sun.com/bugreport/), 
(java.version,1.8.0_402), (java.vm.info,mixed mode), (java.vm.name,OpenJDK 64-Bit Server VM), (java.vm.specification.name,Java Virtual Machine Specification), (java.vm.specification.vendor,Oracle Corporation), (java.vm.specification.version,1.8), (java.vm.vendor,Private Build), (java.vm.version,25.402-b06), (javax.xml.parsers.SAXParserFactory,com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl), (jdk.jar.maxSignatureFileSize,2147483639), (jetty.git.hash,27bde00a0b95a1d5bbee0eae7984f891d2d0f8c9), (line.separator, ), (log4j2.configurationFile,file:/usr/hdp/current/spark3-client/conf/driver-log4j2.properties), (log4jspark.log.dir,/var/log/sparkapp/${user.name}), (log4jspark.log.file,sparkdriver.log), (log4jspark.root.logger,INFO,console,RFA,ETW,Anonymizer), (logFilter.filename,SparkLogFilters.xml), (os.arch,amd64), (os.name,Linux), (os.version,4.15.0-1177-azure), (path.separator,:), (patternGroup.filename,SparkPatternGroups.xml), (sun.arch.data.model,64), (sun.boot.class.path,/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/resources.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/sunrsasign.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jsse.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jce.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes), (sun.boot.library.path,/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64), (sun.cpu.endian,little), (sun.cpu.isalist,), (sun.io.unicode.encoding,UnicodeLittle), (sun.java.command,org.apache.spark.deploy.yarn.ApplicationMaster --class org.apache.spark.deploy.PythonRunner --primary-py-file synapse_control_script_v2.py --arg --snapshots --arg [{"Id":"30bc81f4-f7cd-4bcc-b41e-4a18b28b886e","PathStack":["."],"SnapshotEntityId":null,"SnapshotAssetId":null}] --arg -i --arg ProjectPythonPath:context_managers.ProjectPythonPath --arg -i --arg TrackUserError:context_managers.TrackUserError 
--arg pipeline/data_preparation.py --arg --wrangled_data --arg $AZURE_ML_OUTPUT_wrangled_data --properties-file /mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/__spark_conf__/__spark_conf__.properties --dist-cache-conf /mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/__spark_conf__/__spark_dist_cache__.properties --property-merge-rules-file /mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001/__spark_conf__/__spark_conf_merge_rule__.properties), (sun.java.launcher,SUN_STANDARD), (sun.jnu.encoding,UTF-8), (sun.management.compiler,HotSpot 64-Bit Tiered Compilers), (sun.os.patch.level,unknown), (user.dir,/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1718175835080_0001/container_1718175835080_0001_01_000001), (user.home,/home/trusted-service-user), (user.language,en), (user.name,trusted-service-user), (user.timezone,Etc/UTC)), JVM Information -> List((Java Home,/usr/lib/jvm/java-8-openjdk-amd64/jre), (Java Version,1.8.0_402 (Private Build)), (Scala Version,version 2.12.15)))) by listener DefaultsConfigSparkListener took 21.236292829s. 
2024-06-12 07:06:19,200 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_31_piece0 on vm-58f13156:40101 in memory (size: 16.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,201 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_31_piece0 on vm-58f13156:42761 in memory (size: 16.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,205 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_35_piece0 on vm-58f13156:42761 in memory (size: 4.0 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,206 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_35_piece0 on vm-58f13156:40101 in memory (size: 4.0 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,209 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_25_piece0 on vm-58f13156:42761 in memory (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,209 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_25_piece0 on vm-58f13156:40101 in memory (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,217 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_22_piece0 on vm-58f13156:42761 in memory (size: 14.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,218 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_22_piece0 on vm-58f13156:40101 in memory (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,219 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_22_piece0 on vm-14223739:44757 in memory (size: 14.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,223 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_30_piece0 on vm-58f13156:42761 in memory (size: 15.2 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,224 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_30_piece0 on vm-58f13156:40101 in memory (size: 15.2 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,227 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_20_piece0 on vm-58f13156:42761 in 
memory (size: 7.5 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,228 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_20_piece0 on vm-58f13156:40101 in memory (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,229 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_20_piece0 on vm-14223739:44757 in memory (size: 7.5 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,234 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_34_piece0 on vm-58f13156:42761 in memory (size: 24.6 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,234 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_34_piece0 on vm-58f13156:40101 in memory (size: 24.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,237 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_32_piece0 on vm-58f13156:42761 in memory (size: 16.5 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,237 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_32_piece0 on vm-58f13156:40101 in memory (size: 16.5 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,241 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_26_piece0 on vm-58f13156:42761 in memory (size: 20.1 KiB, free: 3.0 GiB) 2024-06-12 07:06:19,243 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_26_piece0 on vm-58f13156:40101 in memory (size: 20.1 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,243 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Removed broadcast_26_piece0 on vm-14223739:44757 in memory (size: 20.1 KiB, free: 2.2 GiB) 2024-06-12 07:06:19,573 INFO CreateAdviseEventHandler [spark-listener-group-shared]: Sending DataSkew to Advise Hub: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _stageId -> -1, _jobId -> 15, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@1f2e49b3, _level -> warn, _stageAttemptId -> -1, _name -> Data skew for job 15) 2024-06-12 07:06:19,573 INFO KustoHandler 
[spark-listener-group-shared]: Logging DataSkew with appId: application_1718175835080_0001 to Kusto: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _stageId -> -1, _jobId -> 15, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@1f2e49b3, _level -> warn, _stageAttemptId -> -1, _name -> Data skew for job 15) 2024-06-12 07:06:20,269 INFO TaskSetManager [task-result-getter-1]: Finished task 2.0 in stage 30.0 (TID 219) in 10305 ms on vm-14223739 (executor 2) (7/10) 2024-06-12 07:06:20,595 INFO TaskSetManager [task-result-getter-3]: Finished task 4.0 in stage 30.0 (TID 221) in 10455 ms on vm-14223739 (executor 2) (8/10) 2024-06-12 07:06:20,635 INFO TaskSetManager [task-result-getter-2]: Finished task 3.0 in stage 30.0 (TID 220) in 10598 ms on vm-14223739 (executor 2) (9/10) 2024-06-12 07:06:20,674 INFO TaskSetManager [task-result-getter-0]: Finished task 5.0 in stage 30.0 (TID 222) in 9733 ms on vm-14223739 (executor 2) (10/10) 2024-06-12 07:06:20,674 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 30.0, whose tasks have all completed, from pool 2024-06-12 07:06:20,675 INFO DAGScheduler [dag-scheduler-event-loop]: ShuffleMapStage 30 (showString at NativeMethodAccessorImpl.java:0) finished in 21.142 s 2024-06-12 07:06:20,675 INFO DAGScheduler [dag-scheduler-event-loop]: looking for newly runnable stages 2024-06-12 07:06:20,675 INFO DAGScheduler [dag-scheduler-event-loop]: running: Set() 2024-06-12 07:06:20,675 INFO DAGScheduler [dag-scheduler-event-loop]: waiting: Set() 2024-06-12 07:06:20,675 INFO DAGScheduler [dag-scheduler-event-loop]: failed: Set() 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: column 
stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:20,687 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:20,688 INFO EnsureOptimalPartitioningHelper [Thread-47]: column stats for List(CustomerID#724L) does not exist 2024-06-12 07:06:20,688 INFO EnsureOptimalPartitioningHelper [Thread-47]: stats doesn't allow to use List(CustomerID#724L), returning default shuffle keys 2024-06-12 07:06:20,692 INFO ShufflePartitionsUtil [Thread-47]: For shuffle(6, 9, 10), advisory target size: 67108864, actual target size 67108864, minimum partition size: 1048576 2024-06-12 07:06:20,722 INFO TorrentBroadcast [Thread-47]: Started reading broadcast variable 36 with 5 pieces (estimated total size 20.0 MiB) 2024-06-12 07:06:20,723 INFO TorrentBroadcast [Thread-47]: Reading broadcast variable 36 took 1 ms 2024-06-12 07:06:20,810 INFO CodeGenerator [Thread-47]: Code generated in 12.588801 ms 2024-06-12 07:06:20,854 INFO CodeGenerator [Thread-47]: Code generated in 8.70217 ms 2024-06-12 07:06:20,888 INFO LighterServerPlugin [Thread-47]: Loaded Lighter server plugin: org.apache.spark.lighter.DefaultLighterServerPlugin 2024-06-12 07:06:20,896 INFO SparkContext [Thread-47]: Starting job: showString at NativeMethodAccessorImpl.java:0 2024-06-12 07:06:20,897 INFO DAGScheduler [dag-scheduler-event-loop]: Got job 25 (showString at 
NativeMethodAccessorImpl.java:0) with 1 output partitions 2024-06-12 07:06:20,897 INFO DAGScheduler [dag-scheduler-event-loop]: Final stage: ResultStage 48 (showString at NativeMethodAccessorImpl.java:0) 2024-06-12 07:06:20,897 INFO DAGScheduler [dag-scheduler-event-loop]: Parents of final stage: List(ShuffleMapStage 45, ShuffleMapStage 46, ShuffleMapStage 47) 2024-06-12 07:06:20,897 INFO DAGScheduler [dag-scheduler-event-loop]: Missing parents: List() 2024-06-12 07:06:20,898 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting ResultStage 48 (MapPartitionsRDD[100] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents 2024-06-12 07:06:20,906 INFO LighterServerPlugin [dag-scheduler-event-loop]: Loaded Lighter server plugin: org.apache.spark.lighter.DefaultLighterServerPlugin 2024-06-12 07:06:20,908 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_37 stored as values in memory (estimated size 191.3 KiB, free 3.0 GiB) 2024-06-12 07:06:20,910 INFO MemoryStore [dag-scheduler-event-loop]: Block broadcast_37_piece0 stored as bytes in memory (estimated size 67.8 KiB, free 3.0 GiB) 2024-06-12 07:06:20,910 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_37_piece0 in memory on vm-58f13156:42761 (size: 67.8 KiB, free: 3.0 GiB) 2024-06-12 07:06:20,911 INFO SparkContext [dag-scheduler-event-loop]: Created broadcast 37 from broadcast at DAGScheduler.scala:1521 2024-06-12 07:06:20,911 INFO DAGScheduler [dag-scheduler-event-loop]: Submitting 1 missing tasks from ResultStage 48 (MapPartitionsRDD[100] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0)) 2024-06-12 07:06:20,911 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Adding task set 48.0 with 1 tasks resource profile 0 2024-06-12 07:06:20,912 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.0 in stage 48.0 (TID 250) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 8153 bytes) 
taskResourceAssignments Map() 2024-06-12 07:06:20,919 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_37_piece0 in memory on vm-58f13156:40101 (size: 67.8 KiB, free: 2.2 GiB) 2024-06-12 07:06:21,027 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 6 to 10.0.32.6:33208 2024-06-12 07:06:21,051 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece0 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:06:21,060 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece1 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:06:21,070 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece3 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:06:21,075 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece4 in memory on vm-58f13156:40101 (size: 593.6 KiB, free: 2.2 GiB) 2024-06-12 07:06:21,085 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_36_piece2 in memory on vm-58f13156:40101 (size: 4.0 MiB, free: 2.2 GiB) 2024-06-12 07:06:21,148 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-0]: Asked to send map output locations for shuffle 9 to 10.0.32.6:33208 2024-06-12 07:06:21,862 INFO MapOutputTrackerMasterEndpoint [dispatcher-event-loop-1]: Asked to send map output locations for shuffle 10 to 10.0.32.6:33208 2024-06-12 07:06:22,124 INFO BlockManagerInfo [dispatcher-BlockManagerMaster]: Added broadcast_33_piece0 in memory on vm-58f13156:40101 (size: 9.5 KiB, free: 2.2 GiB) 2024-06-12 07:11:00,050 WARN TaskSetManager [task-result-getter-1]: Lost task 0.0 in stage 48.0 (TID 250) (vm-58f13156 executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/__init__.py", line 1069, in 
udf pyfunc_backend.prepare_env( File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/backend.py", line 89, in prepare_env conda_env_path = os.path.join(local_path, self._config[ENV]) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/posixpath.py", line 90, in join genericpath._check_arg_types('join', a, *p) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/genericpath.py", line 152, in _check_arg_types raise TypeError(f'{funcname}() argument must be str, bytes, or ' TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict' at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:559) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:101) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:50) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:512) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366) at org.apache.spark.rdd.RDD.iterator(RDD.scala:330) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at 
org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2024-06-12 07:11:00,053 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.1 in stage 48.0 (TID 251) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 8153 bytes) taskResourceAssignments Map() 2024-06-12 07:15:38,141 WARN TaskSetManager [task-result-getter-3]: Lost task 0.1 in stage 48.0 (TID 251) (vm-58f13156 executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/__init__.py", line 1069, in udf pyfunc_backend.prepare_env( File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/backend.py", line 89, in prepare_env conda_env_path = os.path.join(local_path, self._config[ENV]) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/posixpath.py", line 90, in join genericpath._check_arg_types('join', a, *p) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/genericpath.py", line 152, in _check_arg_types raise TypeError(f'{funcname}() argument must be str, bytes, or ' TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict' at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:559) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:101) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:50) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:512) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) 
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366) at org.apache.spark.rdd.RDD.iterator(RDD.scala:330) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2024-06-12 07:15:38,142 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.2 in stage 48.0 (TID 252) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 8153 bytes) taskResourceAssignments Map() 2024-06-12 07:20:17,341 WARN TaskSetManager [task-result-getter-2]: Lost task 0.2 in stage 48.0 (TID 252) (vm-58f13156 executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/__init__.py", line 1069, in udf pyfunc_backend.prepare_env( File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/backend.py", line 89, in prepare_env conda_env_path = os.path.join(local_path, self._config[ENV]) File 
"/home/trusted-service-user/cluster-env/env/lib/python3.10/posixpath.py", line 90, in join genericpath._check_arg_types('join', a, *p) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/genericpath.py", line 152, in _check_arg_types raise TypeError(f'{funcname}() argument must be str, bytes, or ' TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict' at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:559) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:101) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:50) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:512) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366) at org.apache.spark.rdd.RDD.iterator(RDD.scala:330) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2024-06-12 07:20:17,343 INFO TaskSetManager [dispatcher-CoarseGrainedScheduler]: Starting task 0.3 in stage 48.0 (TID 253) (vm-58f13156, executor 1, partition 0, NODE_LOCAL, 8153 bytes) taskResourceAssignments Map() 2024-06-12 07:24:56,986 WARN TaskSetManager [task-result-getter-0]: Lost task 0.3 in stage 48.0 (TID 253) (vm-58f13156 executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/__init__.py", line 1069, in udf pyfunc_backend.prepare_env( File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/backend.py", line 89, in prepare_env conda_env_path = os.path.join(local_path, self._config[ENV]) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/posixpath.py", line 90, in join genericpath._check_arg_types('join', a, *p) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/genericpath.py", line 152, in _check_arg_types raise TypeError(f'{funcname}() argument must be str, bytes, or ' TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict' at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:559) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:101) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:50) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:512) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366) at org.apache.spark.rdd.RDD.iterator(RDD.scala:330) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) 2024-06-12 07:24:56,988 ERROR TaskSetManager [task-result-getter-0]: Task 0 in stage 48.0 failed 4 times; aborting job 2024-06-12 07:24:56,989 INFO YarnClusterScheduler [task-result-getter-0]: Removed TaskSet 48.0, whose tasks have all completed, from pool 2024-06-12 07:24:56,991 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Cancelling stage 48 2024-06-12 07:24:56,991 INFO YarnClusterScheduler [dag-scheduler-event-loop]: Killing all running tasks in stage 48: Stage cancelled 2024-06-12 07:24:56,992 INFO DAGScheduler [dag-scheduler-event-loop]: ResultStage 48 (showString at NativeMethodAccessorImpl.java:0) failed in 1116.093 s due to Job aborted due to stage failure: Task 0 in stage 48.0 failed 4 times, most recent failure: Lost task 0.3 in stage 48.0 (TID 253) (vm-58f13156 executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/__init__.py", line 
1069, in udf pyfunc_backend.prepare_env( File "/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/mlflow/pyfunc/backend.py", line 89, in prepare_env conda_env_path = os.path.join(local_path, self._config[ENV]) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/posixpath.py", line 90, in join genericpath._check_arg_types('join', a, *p) File "/home/trusted-service-user/cluster-env/env/lib/python3.10/genericpath.py", line 152, in _check_arg_types raise TypeError(f'{funcname}() argument must be str, bytes, or ' TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict' at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:559) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:101) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:50) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:512) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:400) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:57) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:366) at org.apache.spark.rdd.RDD.iterator(RDD.scala:330) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at 
org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace: 2024-06-12 07:24:56,993 INFO DAGScheduler [Thread-47]: Job 25 failed: showString at NativeMethodAccessorImpl.java:0, took 1116.096956 s 2024-06-12 07:24:57,031 INFO CreateAdviseEventHandler [spark-listener-group-shared]: Sending TaskError to Advise Hub: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _jobId -> 25, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@60066486, _level -> error, _description -> , _name -> Job 25 error summary) 2024-06-12 07:24:57,031 INFO KustoHandler [spark-listener-group-shared]: Logging TaskError with appId: application_1718175835080_0001 to Kusto: Map(_source -> user, _jobGroupId -> -1, _detail -> null, _executionId -> -1, _jobId -> 25, DETAIL -> org.apache.spark.advise.output.AdviseDetailWrapper@60066486, _level -> error, _description -> , _name -> Job 25 error summary)