AppiumDriver Basics

Get introduced to the basics of WebDriver and AppiumDriver and the methods they provide.

About WebDriver

WebDriver is an interface that acts as a bridge between commands (finding, clicking, typing keys, scrolling, etc.) and the device/browser. The client sends each command to the server as an HTTP request using the JSON Wire Protocol.
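
To make the HTTP exchange concrete, the sketch below assembles the request a client would send for a find-element command under the JSON Wire Protocol (a POST to /session/:sessionId/element with a `using` strategy and a `value`). The class name `WireRequest`, the session id, and the locator value are hypothetical. Nothing is sent over the network, and values are not JSON-escaped in this simplified version.

```java
// Hypothetical helper: formats a JSON Wire Protocol "find element" request.
// It only builds strings; no HTTP call is made.
class WireRequest {
    // Path of the find-element endpoint for a given session.
    static String findElementPath(String sessionId) {
        return "/session/" + sessionId + "/element";
    }

    // JSON body naming the locator strategy and the value to look up.
    // Assumes both arguments contain no characters that need JSON escaping.
    static String findElementBody(String using, String value) {
        return "{\"using\":\"" + using + "\",\"value\":\"" + value + "\"}";
    }

    public static void main(String[] args) {
        String sessionId = "abc123"; // placeholder session id
        System.out.println("POST " + findElementPath(sessionId));
        System.out.println(findElementBody("id", "login_button"));
    }
}
```

A call such as driver.findElement(By) is translated by the client library into a request of roughly this shape, and the server replies with a JSON response.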

About AppiumDriver

AppiumDriver is a generic implementation of the interface WebDriver. In addition, we have specialized implementations like AndroidDriver, IOSDriver, etc., providing device type-specific commands.

The following table shows the methods that are available in AppiumDriver:

Method Description
driver.activateApp(bundleId) activates the given app if installed but not running, or if it is running in the background
driver.close() closes the current window, quitting the browser if it is the last window currently open
driver.closeApp() closes the app that was provided in the capabilities at session creation and quits the session
driver.launchApp() launches the app provided in the capabilities
driver.context(name) switches the focus of future commands, for this driver, to the context with the given name
driver.execute(String driverCommand, Map<String, ?> parameters) executes the JSONWP command and returns a response
driver.findElement(By) finds the first WebElement matching the given locator
driver.findElements(By) finds all the WebElements matching the given locator
driver.get(url) opens the given URL in the current browser tab
driver.getTitle() gets the title of the current browser tab/window
driver.getCurrentUrl() returns the URL of the page currently loaded in the browser
driver.getContext() gets the current context handle
driver.getContextHandles() returns the set of context handles that can be used to iterate over available contexts
driver.getWindowHandle() gets the current window handle
driver.getWindowHandles() returns the set of window handles which can be used to iterate over available handles
driver.switchTo().frame(index or name) switches the control to the frame identified by index or name
driver.manage().timeouts().implicitlyWait(time, timeUnit) sets the implicit wait, i.e., how long the driver keeps looking for an element before the lookup fails
driver.getPageSource() returns the source of the current visible screen
driver.installApp(String appPath) installs the given app on the mobile device
driver.isAppInstalled(String bundleId) checks whether the given app is installed on the mobile device
driver.location() returns the physical location information if present, otherwise returns null
driver.setLocation(location) sets the GPS location of the device. Only works on emulators
driver.pullFile(remoteFile) returns the base64 string representing the file content from the simulator/device
driver.pullFolder(remoteDirectory) returns the base64 string representing the zipped folder contents from the simulator/device
driver.queryAppState(bundleId) returns the state of the given app
driver.quit() closes all the tabs and windows associated with the browser
driver.removeApp(bundleId) uninstalls the given app
driver.resetApp() resets the currently running app together with the session
driver.rotate(screenOrientation) rotates the device to the given screen orientation (portrait or landscape)
driver.rotate(deviceRotation) rotates the device to the given DeviceRotation (x, y, and z angles)
driver.rotation() returns the current rotation of the device
driver.runAppInBackground(duration) puts the app in the background for the given duration. The call is synchronous and blocks until the duration ends.
driver.terminateApp(bundleId) terminates the given app if running
driver.manage().logs().get(logType) fetches the log entries of the given type
driver.getScreenshotAs(X) fetches the screenshot of the current visible screen as the given type X, where X can be file or bytes
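
As an illustration of how getContextHandles() and context(name) fit together, the sketch below picks a web view out of a set of context handles. It assumes the usual Appium naming, where the native context is called NATIVE_APP and web view contexts start with WEBVIEW. The class `ContextPicker` and the package name are hypothetical, and a plain set stands in for the value a live driver would return.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: selects a web view context from the available handles.
class ContextPicker {
    // Returns the first handle that names a web view, if any.
    static Optional<String> firstWebView(Set<String> handles) {
        return handles.stream()
                .filter(h -> h.startsWith("WEBVIEW"))
                .findFirst();
    }

    public static void main(String[] args) {
        // Stand-in for driver.getContextHandles(); the package name is made up.
        Set<String> handles = new LinkedHashSet<>();
        handles.add("NATIVE_APP");
        handles.add("WEBVIEW_com.example.app");
        // With a real session, the chosen name would be passed to driver.context(name).
        System.out.println(firstWebView(handles).orElse("NATIVE_APP"));
    }
}
```

Falling back to NATIVE_APP when no web view is present keeps later commands pointed at a context that exists.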

About AndroidDriver

AndroidDriver extends AppiumDriver. In addition to the methods available in AppiumDriver, AndroidDriver has the following methods:

Method Description
driver.openNotifications() opens the notification shade on Android devices
driver.getClipboardText() gets the text copied to clipboard
driver.setClipboardText(text) sets the clipboard text
driver.isDeviceLocked() checks if the device is locked
driver.isKeyboardShown() checks if the keyboard is shown
driver.lockDevice() locks the screen of the device
driver.lockDevice(duration) locks the screen of the device for the given duration and then returns
driver.unlockDevice() unlocks the device
driver.sendSMS(phoneNumber, message) emulates sending SMS. Only works on Android emulators.
driver.makeGsmCall(phoneNumber, gsmCallActions) emulates a GSM call action. Only works on Android emulators.
driver.toggleData() toggles mobile data. Works on emulators and rooted Android devices.
driver.toggleAirplaneMode() toggles airplane mode on the Android device
driver.toggleLocationServices() toggles location services
driver.startRecordingScreen(options) starts recording the screen asynchronously. The video is returned later by stopRecordingScreen.
driver.stopRecordingScreen(options) returns the recorded video as base64 encoded string
driver.setNetworkSpeed(networkSpeed) sets the network speed of the device. Only works on Android emulators.
driver.setPowerCapacity(percentage) sets the battery capacity of the device. Only works on Android emulators.
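
Methods such as stopRecordingScreen (and pullFile on AppiumDriver) hand back file content as a base64 string, so the caller has to decode it before saving it. The sketch below shows one way to do that with the JDK alone. The class `RecordingSaver`, the file name, and the sample payload are hypothetical, and the payload is generated locally instead of coming from a device.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper: turns a base64 payload into a file on disk.
class RecordingSaver {
    // Decodes the payload; the MIME decoder tolerates line breaks if present.
    static byte[] decode(String base64Payload) {
        return Base64.getMimeDecoder().decode(base64Payload);
    }

    // Writes the decoded bytes to the target path and returns that path.
    static Path save(String base64Payload, Path target) throws IOException {
        return Files.write(target, decode(base64Payload));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the string returned by driver.stopRecordingScreen().
        String payload = Base64.getEncoder()
                .encodeToString("fake video bytes".getBytes(StandardCharsets.UTF_8));
        Path out = save(payload, Files.createTempFile("recording", ".mp4"));
        System.out.println(out + " (" + Files.size(out) + " bytes)");
    }
}
```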

About IOSDriver

IOSDriver extends AppiumDriver. In addition to the methods available in AppiumDriver, IOSDriver has the following methods:

Method Description
driver.getClipboardText() gets the text copied to clipboard
driver.setClipboardText(text) sets the clipboard text
driver.shake() simulates shaking the device
driver.isDeviceLocked() checks if the device is locked
driver.isKeyboardShown() checks if the keyboard is shown
driver.lockDevice() locks the screen of the device
driver.lockDevice(duration) locks the screen of the device for the given duration and then returns
driver.unlockDevice() unlocks the device

MobileElement

MobileElement is an Appium-specific implementation of WebElement that represents an element located in the app.

The following table shows the methods available on MobileElement:

Method Description
element.clear() clears the contents of the WebElement
element.click() clicks on the WebElement
element.findElement(By) finds the child WebElement
element.getAttribute(name) gets the attribute of the WebElement
element.getCenter() gets the centre point of the WebElement
element.getCoordinates() gets the coordinates of the WebElement
element.getLocation() gets the location of the WebElement
element.getRect() gets the rectangle of the WebElement
element.getScreenshotAs(X) gets the screenshot of the WebElement as X where X can be file or bytes
element.getSize() gets the size of the WebElement
element.getText() gets the text of the WebElement
element.isDisplayed() checks if the WebElement is displayed
element.isEnabled() checks if the WebElement is enabled
element.isSelected() checks if the WebElement is selected
element.sendKeys(keys) sends keys onto the WebElement
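
The geometry methods in the table are related: the centre of an element can be derived from its location (top-left corner) and its size. The sketch below shows that arithmetic with plain integers. The class `ElementGeometry` and the sample numbers are hypothetical, and the values stand in for what getLocation() and getSize() would return.

```java
// Hypothetical helper: derives an element's centre from its top-left corner and size.
class ElementGeometry {
    // Centre = top-left corner plus half the width and height (integer division).
    static int[] center(int x, int y, int width, int height) {
        return new int[] { x + width / 2, y + height / 2 };
    }

    public static void main(String[] args) {
        // Stand-ins for values from element.getLocation() and element.getSize().
        int[] c = center(100, 200, 300, 80);
        System.out.println(c[0] + "," + c[1]); // prints 250,240
    }
}
```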

TouchActions

In addition to the above methods, we can perform touch gestures on the screen or on a MobileElement by using TouchAction.

We can build a TouchAction chain as shown below:

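import io.appium.java_client.TouchAction;
import io.appium.java_client.touch.LongPressOptions;
import io.appium.java_client.touch.TapOptions;
import io.appium.java_client.touch.WaitOptions;
import io.appium.java_client.touch.offset.ElementOption;
import io.appium.java_client.touch.offset.PointOption;
import java.time.Duration;
import org.openqa.selenium.Point;

// driver, webElement, x, and y are assumed to be defined elsewhere.
// The chain lists the available methods; it is not a meaningful gesture.
TouchAction<?> action = new TouchAction<>(driver);
action
        // press at a point on the screen
        .press(PointOption.point(new Point(x, y)))
        // long press at a point
        .longPress(PointOption.point(new Point(x, y)))
        // long press on an element for a given duration
        .longPress(
            LongPressOptions.longPressOptions()
                .withDuration(Duration.ofSeconds(1))
                .withElement(ElementOption.element(webElement)))
        // move the pressed finger to another point
        .moveTo(PointOption.point(new Point(x, y)))
        .press(PointOption.point(new Point(x, y)))
        // tap at a point
        .tap(PointOption.point(new Point(x, y)))
        // tap on an element a given number of times
        .tap(
            TapOptions.tapOptions()
                .withElement(ElementOption.element(webElement))
                .withTapsCount(1))
        // pause between actions
        .waitAction(WaitOptions.waitOptions(Duration.ofSeconds(1)))
        // lift the finger
        .release();
// nothing is sent to the device until perform() is called
action.perform();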

The above code shows the methods that are available on TouchAction. We can combine one or more of them into a chain of actions. At the end of the chain, we call .perform() to execute it.

We can scroll by performing the following actions in sequence:

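import io.appium.java_client.TouchAction;
import io.appium.java_client.touch.WaitOptions;
import io.appium.java_client.touch.offset.PointOption;
import java.time.Duration;

// driver, startX, startY, endX, and endY are assumed to be defined elsewhere
TouchAction<?> action = new TouchAction<>(driver);

action
        .press(PointOption.point(startX, startY))                     // touch down at the start point
        .waitAction(WaitOptions.waitOptions(Duration.ofMillis(100)))  // hold briefly so the press registers
        .moveTo(PointOption.point(endX, endY))                        // drag to the end point
        .release()                                                    // lift the finger
        .perform();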
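
The scroll chain needs start and end coordinates. One common approach is to derive them from the screen size, swiping along the horizontal middle between two fractions of the screen height. The sketch below shows that calculation only. The class `SwipeMath`, the ratios, and the 1080x1920 screen size are assumptions for illustration.

```java
import java.util.Arrays;

// Hypothetical helper: computes the coordinates for a vertical swipe.
class SwipeMath {
    // Returns {startX, startY, endX, endY} for a vertical swipe along the
    // horizontal middle of the screen, from startRatio to endRatio of its height.
    static int[] verticalSwipe(int width, int height, double startRatio, double endRatio) {
        int x = width / 2;
        int startY = (int) (height * startRatio);
        int endY = (int) (height * endRatio);
        return new int[] { x, startY, x, endY };
    }

    public static void main(String[] args) {
        // 1080x1920 is an example size; a real test would read the size
        // from driver.manage().window().getSize().
        int[] p = verticalSwipe(1080, 1920, 0.75, 0.25);
        System.out.println(Arrays.toString(p)); // prints [540, 1440, 540, 480]
    }
}
```

The first pair would feed press(...) and the second pair moveTo(...). Starting lower on the screen and ending higher drags the content upward, which reveals items further down the list.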

Explicit waits

An explicit wait blocks the current thread until a given condition is satisfied or the timeout expires.

// initialize with the driver and the maximum timeout in seconds
// ignoreAll and pollingEvery return FluentWait<WebDriver>, so the variable is declared with that type
FluentWait<WebDriver> wait = new WebDriverWait(driver, 3)
    .ignoreAll(Arrays.asList(StaleElementReferenceException.class))  // exceptions to ignore while waiting for the condition to be satisfied
    .pollingEvery(Duration.ofSeconds(1)); // how often to check if the condition is satisfied
// locator is a By instance that identifies the target element
wait.until(ExpectedConditions.elementToBeClickable(locator));

Note: org.openqa.selenium.support.ui.ExpectedConditions provides many ready-made implementations of ExpectedCondition<T>, and we can write custom implementations as well.
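
To show what an explicit wait does under the hood, here is a minimal polling loop written with the JDK only. It is a simplified sketch of the idea, not Selenium's implementation: the class `SimpleWait` and its `until` signature are hypothetical. It treats null or false as "not ready yet", retries at a fixed interval, and ignores one chosen exception type while waiting.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for an explicit wait.
class SimpleWait {
    // Polls the condition until it returns a value other than null or false,
    // or the timeout elapses. Exceptions of the ignored type count as "not ready yet".
    static <T> T until(Supplier<T> condition, Duration timeout, Duration interval,
                       Class<? extends RuntimeException> ignored) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                T value = condition.get();
                if (value != null && !Boolean.FALSE.equals(value)) {
                    return value;
                }
            } catch (RuntimeException e) {
                if (!ignored.isInstance(e)) {
                    throw e; // unexpected failure: do not swallow it
                }
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("condition not met within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The condition succeeds on the third poll.
        String result = until(() -> ++calls[0] >= 3 ? "ready" : null,
                Duration.ofSeconds(2), Duration.ofMillis(50), IllegalArgumentException.class);
        System.out.println(result + " after " + calls[0] + " polls"); // prints: ready after 3 polls
    }
}
```

The timeout, the polling interval, and the ignored exception type play the same roles as the constructor argument, pollingEvery, and ignoreAll in the WebDriverWait snippet.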
