TETABDAPRFIC3020

Infosys Certified Spark Professional

Practice with real exam-pattern questions for Infosys Certified Spark Professional. Each question includes a detailed explanation to help you understand the concept, not just memorise the answer. Try 10 questions free — no login required.

Intermediate · Big Data · 60 min
Free questions

10 Infosys Certified Spark Professional practice questions with answers

Real Lex exam-pattern multiple-choice questions for the Infosys Certified Spark Professional certification. Each question includes the correct answer. The full question bank is available to Premium members.

  1. Question 1

    Which of the following statements are TRUE about the Spark framework? Choose THREE CORRECT options from below.

    • A

      Supports in-memory data processing

      Correct
    • B

      Follows the lazy evaluation principle

      Correct
    • C

      Does not provide machine learning libraries

    • D

      Supports parallel processing

      Correct
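
    The lazy evaluation and parallel processing points can be checked with a small experiment. The sketch below is meant for spark-shell or a Scala script with Spark on the classpath; the app name, the local[2] master, the accumulator name and the sample range are illustrative assumptions, not part of the exam question.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    // Assumed local setup; getOrCreate reuses the spark-shell context if one exists
    val conf = new SparkConf().setAppName("lazy-eval-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    // Accumulator that counts how many elements the map function has touched
    val touched = sc.longAccumulator("touched")

    // Transformation only: this builds the plan over 4 partitions, nothing runs yet
    val doubled = sc.parallelize(1 to 100, 4).map { x => touched.add(1); x * 2 }
    val touchedBeforeAction = touched.sum

    // Action: the 4 partitions are now processed as parallel tasks
    val total = doubled.sum()
    val touchedAfterAction = touched.sum
    ```

    In a run with no task retries, touchedBeforeAction stays at 0 and only the sum() action drives it to 100, which is lazy evaluation in practice.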

  2. Question 2

    How many stages will be generated from the DAG when the code below is executed?

    val rdd1 = sc.textFile("Customer")
    val rdd2 = rdd1.map(_.split(",")).map(arr1 => (arr1(2),arr1(4).toInt))
    rdd2.cache()
    val rdd3 = rdd2.reduceByKey(_ max _)
    rdd3.saveAsTextFile("output1")

    • A

      1

    • B

      2

      Correct
    • C

      3

    • D

      4
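
    Stages are split at shuffle boundaries, so one way to reason about the count is to inspect each RDD's dependency type. The sketch below assumes spark-shell or a Scala script with Spark on the classpath; the in-memory rows are hypothetical stand-ins for the "Customer" file, and the app name and local[2] master are also assumptions.

    ```scala
    import org.apache.spark.{NarrowDependency, ShuffleDependency, SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("stage-count-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    // Hypothetical sample rows; column 2 is the key, column 4 the numeric value
    val rdd1 = sc.parallelize(Seq("1,Asha,Pune,x,40", "2,Ravi,Pune,y,75", "3,Mei,Delhi,z,20"))
    val rdd2 = rdd1.map(_.split(",")).map(arr1 => (arr1(2), arr1(4).toInt))
    rdd2.cache() // caching marks the RDD for reuse; it does not add a stage

    val rdd3 = rdd2.reduceByKey(_ max _)

    // map steps are narrow (pipelined in one stage); reduceByKey needs a shuffle
    val mapIsNarrow = rdd2.dependencies.head.isInstanceOf[NarrowDependency[_]]
    val reduceIsShuffle = rdd3.dependencies.head.isInstanceOf[ShuffleDependency[_, _, _]]

    val maxPerCity = rdd3.collect().toMap
    ```

    With a single shuffle dependency in the lineage, the job splits into two stages: one up to the shuffle write and one from the shuffle read to the output action.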

  3. Question 3

    During Spark job execution, the program is converted into a lineage graph. Which of the following statements are TRUE with respect to the RDD lineage graph? Choose THREE CORRECT options from below.

    • A

      The graph is not submitted for execution until an action is called

      Correct
    • B

      Transformations that result in data shuffling are mandatory in the lineage graph

    • C

      The lineage graph is generated from the transformations in the program

      Correct
    • D

      Lost RDD partitions can be recovered using the lineage graph

      Correct

  4. Question 4

    Which of the following methods return the number of partitions of an RDD? Choose THREE CORRECT options.

    • A

      rdd.getNumPartitions

      Correct
    • B

      rdd.partitions.length

      Correct
    • C

      rdd.partitions.size

      Correct
    • D

      rdd.partitions
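
    A quick way to compare the options is to run them against an RDD with a known partition count. This is a minimal sketch for spark-shell or a Scala script with Spark on the classpath; the app name, local[2] master and the 4-partition sample RDD are assumptions chosen for illustration.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("partition-count-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    // Ask explicitly for 4 partitions so the expected count is known
    val rdd = sc.parallelize(1 to 10, 4)

    val viaGetNum = rdd.getNumPartitions  // Int
    val viaLength = rdd.partitions.length // Int, length of the Array[Partition]
    val viaSize = rdd.partitions.size     // Int, same array viewed as a collection

    // rdd.partitions alone is the array of Partition objects, not a count
    val partitionArray = rdd.partitions
    ```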

  5. Question 5

    Which of the following methods is used to get the RDD lineage graph in Spark?

    • A

      Stats

    • B

      toDebugString

      Correct
    • C

      dependencies

    • D

      glom
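
    To see what toDebugString returns, it can be called on a small pipeline. The sketch below assumes spark-shell or a Scala script with Spark on the classpath; the word list, app name and local[2] master are hypothetical.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("lineage-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    // Hypothetical input: a few words counted by key
    val words = sc.parallelize(Seq("spark", "hadoop", "spark"))
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)

    // toDebugString describes the RDD and its ancestors as indented text
    val lineage = counts.toDebugString
    println(lineage)
    ```

    The printed text lists the ShuffledRDD produced by reduceByKey first, followed by the parent RDDs it was derived from.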

  6. Question 6

    Select the option that displays the records of an RDD that start with the word "hadoop".

    • A

      rdd.filter(x => x.startsWith("hadoop")).collect

      Correct
    • B

      rdd.filter(x => x.contains("hadoop")).collect

    • C

      rdd.filter(x => x.contains("hadoop")).first

    • D

      rdd.filter(x => x.starts("hadoop")).collect
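
    The difference between startsWith and contains shows up on a small sample. This is a sketch for spark-shell or a Scala script with Spark on the classpath; the three sample lines, the app name and the local[2] master are assumptions.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("filter-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    // Hypothetical records: two start with "hadoop", one only contains it
    val rdd = sc.parallelize(Seq("hadoop is a framework", "spark runs on hadoop", "hadoop yarn"))

    // Keeps only records whose first characters are "hadoop"
    val startingWith = rdd.filter(x => x.startsWith("hadoop")).collect()

    // Keeps every record that mentions "hadoop" anywhere
    val containing = rdd.filter(x => x.contains("hadoop")).collect()
    ```

    Scala strings have no starts method, so the variant that calls x.starts("hadoop") does not compile.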

  7. Question 7

    Which of the given Scala functions is used in Spark to change the number of partitions of an RDD?

    • A

      rdd.changePartition(newnumberOfPartitions)

    • B

      rdd.changePartition(oldnumberOfPartitions,newnumberOfPartitions)

    • C

      rdd.repartition(newnumberOfPartitions)

      Correct
    • D

      rdd.repartition().change(newnumberOfPartitions)
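
    The effect of repartition can be confirmed by checking the partition count before and after. The sketch below assumes spark-shell or a Scala script with Spark on the classpath; the partition counts, app name and local[2] master are illustrative. It also shows coalesce, which the question does not mention, as a related call for reducing partitions.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("repartition-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    val rdd = sc.parallelize(1 to 100, 2)

    // repartition returns a new RDD with the requested partition count (uses a shuffle)
    val wider = rdd.repartition(6)

    // coalesce can lower the partition count, by default without a shuffle
    val narrower = wider.coalesce(3)

    val counts = (rdd.getNumPartitions, wider.getNumPartitions, narrower.getNumPartitions)
    ```

    RDDs are immutable, so rdd itself keeps its 2 partitions; only the returned RDDs have the new counts.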

  8. Question 8

    Consider a sales dataset with the columns CustomerID, Location, Merchant and Amount. The requirement is to create a paired RDD with CustomerID as the key and Amount as the value. Which of the code snippets below correctly creates the paired RDD?

    • A

      val SalesData = sc.textFile("HDFS Path")

      val PairedSalesData = SalesData.map{record => (record.split(",")(0),record.split(",")(3).toLong)}

      Correct
    • B

      val SalesData = sc.textFile("HDFS Path")

      val PairedSalesData = SalesData.flatMap{record => (record.split(",")(1),record.split(",")(4).toLong)}

    • C

      val SalesData = sc.textFile("HDFS Path")

      val PairedSalesData = SalesData.reduce{record => (record.split(",")(0),record.split(",")(3).toLong)}

    • D

      val SalesData = sc.textFile("HDFS Path")

      val PairedSalesData = SalesData.filter{record => (record.split(",")(0),record.split(",")(3).toLong)}
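
    The map-based approach can be tried on in-memory data. This is a sketch for spark-shell or a Scala script with Spark on the classpath; the sample rows replace the HDFS file and, like the app name and local[2] master, are assumptions. It splits each record once instead of twice, which is a stylistic variation on the option shown in the question.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("paired-rdd-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    // Hypothetical rows in the order CustomerID, Location, Merchant, Amount
    val salesData = sc.parallelize(Seq("C1,Pune,ShopA,250", "C2,Delhi,ShopB,100", "C1,Pune,ShopC,50"))

    // map emits exactly one (key, value) tuple per input record
    val pairedSalesData = salesData.map { record =>
      val fields = record.split(",")
      (fields(0), fields(3).toLong) // index 0 = CustomerID, index 3 = Amount
    }

    // Pair RDD operations such as reduceByKey are now available
    val totalPerCustomer = pairedSalesData.reduceByKey(_ + _).collect().toMap
    ```

    The other options do not fit: flatMap expects a collection per record, reduce expects a function of two arguments, and filter expects a Boolean result.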

  9. Question 9

    What is the output of the given code snippet?

    val RDD1 = sc.parallelize(Array(1,2))
    val RDD2 = sc.parallelize(Array(2,3))
    val product=RDD1.cartesian(RDD2)
    val RDD3 = sc.parallelize(Seq((1,2),(2,3),(3,4)))
    product.join(RDD3).collect

    • A

      Type mismatch error, since an integer RDD is combined with a paired RDD in the join

    • B

      Syntax error since it should be collect(), not collect

    • C

      Array((1,(2,2)), (1,(3,2)), (2,(2,3)), (2,(3,3)))

      Correct
    • D

      Array((1,(2,2,3)),(2,(2,3,3)),(3,(0,0,4)))
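
    The snippet can be run as written to check the result. The sketch below assumes spark-shell or a Scala script with Spark on the classpath; the app name and local[2] master are assumptions, and the result is sorted only to make the element order predictable, since collect gives no ordering guarantee after a shuffle.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("cartesian-join-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    val RDD1 = sc.parallelize(Array(1, 2))
    val RDD2 = sc.parallelize(Array(2, 3))

    // cartesian yields every (left, right) pair, so product is already a pair RDD
    val product = RDD1.cartesian(RDD2)

    val RDD3 = sc.parallelize(Seq((1, 2), (2, 3), (3, 4)))

    // join matches on the first tuple element; key 3 has no match in product
    val joined = product.join(RDD3).collect().sorted
    ```

    Because cartesian produces tuples, the join type-checks, and each product pair with key 1 or 2 is combined with the matching value from RDD3.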

  10. Question 10

    Which of the following functions are RDD actions? Choose THREE CORRECT options from below.

    • A

      foreach()

      Correct
    • B

      collect()

      Correct
    • C

      reduceByKey()

    • D

      take(n)

      Correct
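
    One practical way to tell actions from transformations is to look at what each call returns: a transformation gives back another RDD, while an action gives back a plain value or Unit. The sketch below assumes spark-shell or a Scala script with Spark on the classpath; the sample pairs, app name and local[2] master are hypothetical.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.RDD

    val conf = new SparkConf().setAppName("actions-sketch").setMaster("local[2]")
    val sc = SparkContext.getOrCreate(conf)

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // Transformation: returns a new RDD and runs nothing by itself
    val reduced: RDD[(String, Int)] = pairs.reduceByKey(_ + _)

    // Actions: each one triggers a job and returns a local result
    val firstTwo: Array[(String, Int)] = pairs.take(2)
    val everything: Array[(String, Int)] = reduced.collect()
    val nothing: Unit = pairs.foreach(_ => ()) // runs on executors, returns Unit
    ```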

InfyLexDumps

Independent exam preparation platform for Infosys Lex certifications. Real exam-pattern question banks, monthly updates, 180K+ community members.

Join Premium Telegram
Contact
  • @prepflixadmin
  • admin@prepflix.net
This platform is an independent educational resource and is not affiliated with or endorsed by Infosys Ltd. All certification names referenced are property of their respective owners.
© 2026 InfyLexDumps