UI Test automation design framework with Selenium

Adam Samsonowicz

In this article I will show you design approaches that improve the readability and maintainability of your Selenium framework. If your page objects are too big or your Selenium utility classes are too complex, this article is for you. Let me start with a PageObject that you may have seen before.

public class Item {

    @FindBy(id="description")
    private WebElement description;
    @FindBy(id="addmore")
    private WebElement addMoreButton;
    @FindBy(id="itemType")
    private WebElement itemTypeDropdown;

    public Item(WebDriver driver) {
        PageFactory.initElements(driver, this);
    }

    public String getDescription() {
        return description.getText();
    }

    public void increaseItemQuantity() {
        addMoreButton.click();
    }

    public void selectItemType(String value) {
        SeleniumUtils.selectWithDropdown(itemTypeDropdown, value);
        //or other approach with handling more complex UI interaction logic
    }
}

or

public class Item {

    private By description = By.id("description");

    private By addMoreButton = By.id("addmore");

    private By itemTypeDropdown = By.id("itemType");

    private WebDriver driver;

    public Item(WebDriver driver) {
        this.driver = driver;
    }

    public String getDescription() {
        return SeleniumUtils.getText(description);
    }

    public void increaseItemQuantity() {
        SeleniumUtils.click(addMoreButton);
    }

    public void selectItemType(String value) {
        SeleniumUtils.selectWithDropdown(itemTypeDropdown, value);
    }
}

What’s wrong with this?

It’s hard to say without context. If the system contains only one “Item”, this design is a good fit. But what if there are more Items and the system is more complex? First of all, our id locators would stop working, because the ids would be duplicated. It would also be harder to maintain the framework with such an approach.
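One way to cope with repeated Items is to scope every locator to the container of a single Item. The following is only a minimal sketch: the `ItemLocators` helper, the `item` class and the `data-field` attribute are assumptions about the markup, not something the system above is known to provide.

```java
// Hypothetical helper: builds XPath expressions scoped to the n-th item on the page.
// The "item" class and the "data-field" attribute are assumed markup conventions.
final class ItemLocators {

    private ItemLocators() {
    }

    // XPath positions are 1-based, so index 1 points at the first item.
    static String container(int index) {
        if (index < 1) {
            throw new IllegalArgumentException("XPath positions start at 1");
        }
        return "(//div[@class='item'])[" + index + "]";
    }

    // Locator of a single field inside the chosen item container.
    static String field(int index, String fieldName) {
        return container(index) + "//*[@data-field='" + fieldName + "']";
    }
}
```

The resulting string can be passed to `By.xpath(...)`, so two Items on the same page no longer compete for the same id.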

I would like to show you how to improve this design in a few steps.

Element wrappers

  • Problem
    You may have to deal with a system that integrates the UIs of other systems, e.g. through iframes.
    Different libraries are used for interface components such as dropdowns, comboboxes, numeric boxes etc.
  • Solution
    Instead of keeping your element interaction logic in one place, move it to the elements themselves.
    So we start from a design that we have seen in books and many tutorials, like this
public class SeleniumUtils {
    public static String getText(WebElement element) {
        //extracting logic
    }

    public static String getText(By elementLocator) {
        //extracting logic
    }

    //other helper methods, skipped for brevity
}

Which can expand to something like this

public static String getText(WebElement element) {
    if(condition1) //text extracting logic for one kind of elements
    else if(condition2) //text extracting for other elements
    //and so on
}

Of course, it could be done in a more elegant way than just adding if statements to one method, but the point is that it’s still not enough to keep this class readable. A good example of such a use case is getting text from an 'input' and from text-holding elements like 'span' or 'td'. To extract data from an 'input' we have to read its 'value' attribute, while for 'span' or 'td' the plain Selenium getText() is enough.

That’s only a data extraction method; imagine dealing with the many ways of interacting with a dropdown selection.

Instead of keeping one class with selenium actions let me show you the following approach.

public interface Element extends WebElement {
    WebElement getWrappedElement();
}

class ElementImpl implements Element {
    protected final WebElement element;
    protected JavascriptExecutor js;


    public ElementImpl(final WebElement element) {
        this.element = element;
        this.js = (JavascriptExecutor) (((WrapsDriver) getWrappedElement()).getWrappedDriver());
    }

    @Override
    public void click() {
        element.click();
    }

    @Override
    public boolean isDisplayed() {

        return element.isDisplayed();
    }

    public WebElement getWrappedElement() {
        return element;
    }

    @Override
    public void submit() {
        element.submit();
    }

    @Override
    public void sendKeys(CharSequence... charSequences) {
        if (isDisplayed()) {
            element.sendKeys(charSequences);
        }
    }

    @Override
    public String getText() {
        isDisplayed();
        return element.getText();
    }

    @Override
    public void clear() {
        if(isDisplayed()) element.clear();
    }

    @Override
    public String getAttribute(String s) {
        return element.getAttribute(s);
    }

    private boolean isAttachedToDOM() {
        try {
            element.isEnabled();
        } catch (StaleElementReferenceException e) {
            return false;
        }
        return true;
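    }

    // The remaining WebElement methods (findElement, getTagName, isEnabled, ...)
    // delegate to the wrapped element in the same way; skipped for brevity.
}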

We have wrapped the plain Selenium WebElement in an Element.
Okay, but what’s the difference, and why should I bother writing additional code when I can just pass an XPath or a WebElement to a helper function and get the job done?
The real benefit of this solution becomes visible in the example below.

public interface Input extends Element {
    void clear();
    void insert(String text);
    String getText();
}

class InputInSystemX extends ElementImpl implements Input {

    private final int productHeight = 134;

    public InputInSystemX(WebElement element) {
        super(element);
    }

    @Override
    public void insert(String text) {
        sendKeys(text);
    }

    @Override
    public String getText() {
        return getAttribute("value");
    }

}

class InputInSystemY extends ElementImpl implements Input {

    public InputInSystemY(WebElement element) {
        super(element);
    }

    @Override
    public void insert(String text) {
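        // Pass the text as a script argument instead of concatenating it,
        // so quotes in the text cannot break the script.
        js.executeScript("arguments[0].value = arguments[1]", element, text);
        js.executeScript("$(arguments[0]).trigger('change')", element);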
    }

    @Override
    public String getText() {
        return getAttribute("value");
    }

}

As you can see, we have separated the logic into two classes. The benefit is that more complex behavior, for example driving Kendo UI for jQuery widgets, lives in its own class, completely separated from the simpler approach. In most scenarios the elements of two separate systems are unrelated anyway.

public interface Dropdown extends Element {
    void select(String value);
}

class DropdownInSystemX extends ElementImpl implements Dropdown {

    public DropdownInSystemX(WebElement element) {
        super(element);
    }

    @Override
    public void select(String value) {
        Select dropdown = new Select(element);
        dropdown.selectByVisibleText(value);
    }
}

class DropdownInSystemY extends ElementImpl implements Dropdown {

    public DropdownInSystemY(WebElement element) {
        super(element);
    }

    @Override
    public void select(String value) {
        waitUntil(() -> ((Boolean) js.executeScript("return jQuery.active == 0")));

        var dataTextField = (String) js.executeScript(
                "return $(arguments[0]).data('kendoDropDownList').options.dataTextField", element);
        var kendoApiScript = "$(arguments[0]).data('kendoDropDownList').select(function(dataItem){return dataItem." +
                dataTextField +
                " === \"" + value + "\"})";

        js.executeScript(kendoApiScript, element);
        js.executeScript("return $(arguments[0]).data('kendoDropDownList').trigger('change')", element);
    }
}

Hopefully this illustrates the benefits of separation. One system may use the jQuery library while the others do not, and there is no need to introduce if statements to tell them apart.
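To make the point concrete, here is a minimal sketch of client code written against an interface only. The types are simplified stand-ins: all names are hypothetical and no Selenium classes are involved, so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for an input wrapper interface, without the Selenium dependency.
interface TextInput {
    void insert(String text);

    String getText();
}

// Stand-in for an input that is filled by typing.
class TypedInput implements TextInput {
    private final StringBuilder value = new StringBuilder();

    @Override
    public void insert(String text) {
        value.append(text);
    }

    @Override
    public String getText() {
        return value.toString();
    }
}

// Stand-in for an input whose value is set in one step and that
// then has to fire a change event.
class ScriptedInput implements TextInput {
    private String value = "";
    final List<String> firedEvents = new ArrayList<>();

    @Override
    public void insert(String text) {
        value = text;
        firedEvents.add("change");
    }

    @Override
    public String getText() {
        return value;
    }
}

// Client code depends only on the interface, so it needs no if statements.
class DescriptionEditor {
    private final TextInput input;

    DescriptionEditor(TextInput input) {
        this.input = input;
    }

    String enter(String text) {
        input.insert(text);
        return input.getText();
    }
}
```

Swapping `new TypedInput()` for `new ScriptedInput()` changes how the value gets into the field, while `DescriptionEditor` stays untouched.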

We have described how to wrap Selenium WebElements so let’s see how our PageObject has changed in this process.

public class Item {


    private Element description;

    private Button addMore;

    private Dropdown itemType;

    public Item(WebDriver driver) {

    }

    public String getDescription() {
        return description.getText();
    }

    public void increaseItemQuantity() {
        addMore.click();
    }

    public void selectItemType(String value) {
        itemType.select(value);
    }
}

It’s a little cleaner, and we can also remove the SeleniumUtils helper class.

Okay, but how do we handle element initialization, and when should elements be initialized?

As you may remember, the ElementImpl constructor accepts a WebElement as an argument. The simplest way would be to pass a WebElement into the constructor, but the goal is to keep our class clean, so we need another approach.

    public static <T extends Element> T getElement(WebElement element, Class<?> clz) {
        String elementType = clz.getSimpleName();
        return (T) switch (elementType) {
            case "Element" -> new ElementImpl(element);
            case "Input" -> new InputImpl(element);
            case "Button" -> new ButtonImpl(element);
            case "Dropdown" -> new KendoDropDownList(element);
            default -> throw new NoSuchWebElementTypeException("Element Type was not found in the list.");
        };
    }

Okay, the element wrapper implementation remains hidden.
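A switch on the simple class name works, but it has to be edited for every new wrapper. As a possible variation (not the approach used in this article), the mapping can live in a small registry keyed by the interface type. The sketch below uses hypothetical stand-in types instead of WebElement and the real wrappers, so it compiles without Selenium.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-ins for WebElement and the wrapper types, so the sketch needs no Selenium.
interface RawElement {
}

interface Wrapper {
    RawElement raw();
}

interface ButtonWrapper extends Wrapper {
}

interface InputWrapper extends Wrapper {
}

class BasicWrapper implements Wrapper {
    private final RawElement raw;

    BasicWrapper(RawElement raw) {
        this.raw = raw;
    }

    @Override
    public RawElement raw() {
        return raw;
    }
}

class SimpleButton extends BasicWrapper implements ButtonWrapper {
    SimpleButton(RawElement raw) {
        super(raw);
    }
}

class SimpleInput extends BasicWrapper implements InputWrapper {
    SimpleInput(RawElement raw) {
        super(raw);
    }
}

// Hypothetical registry: maps a wrapper interface to the constructor of its implementation.
final class WrapperRegistry {
    private final Map<Class<?>, Function<RawElement, ? extends Wrapper>> factories = new HashMap<>();

    <T extends Wrapper> void register(Class<T> type, Function<RawElement, ? extends T> factory) {
        factories.put(type, factory);
    }

    <T extends Wrapper> T create(Class<T> type, RawElement raw) {
        Function<RawElement, ? extends Wrapper> factory = factories.get(type);
        if (factory == null) {
            throw new IllegalArgumentException("No wrapper registered for " + type.getName());
        }
        // Class.cast keeps the conversion checked instead of relying on an unchecked cast.
        return type.cast(factory.apply(raw));
    }
}
```

Registration would then look like `registry.register(ButtonWrapper.class, SimpleButton::new)`, and each system under test can register its own implementations.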

To indicate an element path we can use Java annotations.

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
public @interface Locator {
    String xPath() default "";
}
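Reading such an annotation back at runtime needs only standard reflection, and it works because the annotation is retained with RetentionPolicy.RUNTIME. A minimal sketch follows; `SamplePage` and `LocatorReader` are hypothetical names, and `Object` merely stands in for the element wrapper types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Same shape as the Locator annotation, repeated so the sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Locator {
    String xPath() default "";
}

// Hypothetical page class; Object stands in for the element wrapper types.
class SamplePage {
    @Locator(xPath = "//div[@id='description']")
    private Object description;

    private Object notAnnotated;
}

final class LocatorReader {

    private LocatorReader() {
    }

    // Returns the XPath declared on the given field, or fails loudly if it is missing.
    static String xPathOf(Class<?> pageClass, String fieldName) {
        try {
            Field field = pageClass.getDeclaredField(fieldName);
            Locator locator = field.getAnnotation(Locator.class);
            if (locator == null || locator.xPath().isEmpty()) {
                throw new IllegalStateException("No @Locator xPath on field " + fieldName);
            }
            return locator.xPath();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("Unknown field " + fieldName, e);
        }
    }
}
```

Failing with an explicit message when the annotation is missing is a design choice of this sketch; it tends to be easier to debug than a NullPointerException raised deep inside a test.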

So PageObject now looks like this.

public class Item {

    @Locator(xPath="//xpath to description")
    private Element description;
    @Locator(xPath="//xpath to addMore")
    private Button addMore;
    @Locator(xPath="//xpath to itemType")
    private Dropdown itemType;

    public Item(WebDriver driver) {

    }

    public String getDescription() {
        return description.getText();
    }

    public void increaseItemQuantity() {
        addMore.click();
    }

    public void selectItemType(String value) {
        itemType.select(value);
    }

}

With our elements annotated, we can now initialize them. There are two ways of doing it: eager initialization on object creation, or lazy initialization on element access. This tutorial shows the second way, as it makes more sense in most cases. To do it we need one more helper method that actually returns the element object.

    public static <T extends Element> T initElement
            (WebDriver driver, Class<?> pageClass, String fieldName) {
        T result = null;

        try {
            Class<T> classInterface = (Class<T>)
                    pageClass.getDeclaredField(fieldName).getType();

            //xpath is By.xpath, statically imported
            result = getElement(
                    driver.findElement(
                            xpath(pageClass.getDeclaredField(fieldName)
                                    .getAnnotation(Locator.class)
                                    .xPath()
                            )), classInterface);
        } catch (NoSuchFieldException e) {
            //handle exception
        }

        return result;
    }
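Note that the helper above looks the element up on every call. If repeated lookups become costly, the result can be cached after the first access. The following is a minimal, Selenium-independent sketch of that variation; `Lazy` is a hypothetical helper, and it is not thread-safe.

```java
import java.util.function.Supplier;

// Hypothetical helper: defers a lookup until first access and then caches the result.
// Not thread-safe; intended for a single test thread.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> lookup;
    private T value;
    private boolean initialized;

    Lazy(Supplier<T> lookup) {
        this.lookup = lookup;
    }

    @Override
    public T get() {
        if (!initialized) {
            value = lookup.get();
            initialized = true;
        }
        return value;
    }

    // Forgets the cached value, e.g. after the page re-renders and the element goes stale.
    void reset() {
        value = null;
        initialized = false;
    }
}
```

The trade-off is that a cached element can become stale when the page changes, which is why the sketch includes `reset()`; looking the element up on every access avoids that problem at the price of extra driver calls.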

And now PageObject looks like this.

public class Item {

    @Locator(xPath="//xpath to description")
    private Element description;
    @Locator(xPath="//xpath to addMore")
    private Button addMore;
    @Locator(xPath="//xpath to itemType")
    private Dropdown itemType;

    private WebDriver driver;

    public Item(WebDriver driver) {
       this.driver = driver;
    }

    public String getDescription() {
        return getDescriptionElement().getText();
    }

    public void increaseItemQuantity() {
        getAddMore().click();
    }

    public void selectItemType(String value) {
        getItemType().select(value);
    }

    // Distinct name: Java does not allow two methods that differ only in return type.
    private Element getDescriptionElement() {
        return initElement(driver, Item.class, "description");
    }

    private Button getAddMore() {
        return initElement(driver, Item.class, "addMore");
    }

    private Dropdown getItemType() {
        return initElement(driver, Item.class, "itemType");
    }

}

So far we have removed the SeleniumUtils class, which led us to treat elements more like real objects that can take care of themselves.

With a proper approach to elements in place, let’s now move on to splitting our page object.

Split page object into classes (Structure, Act/Command, Get/Query)

Quick description:

  • The Structure object holds the Page Object’s elements and is responsible for providing initialized elements.
  • The Act/Command object is responsible for user input/interaction with the system.
  • The Get/Query object is responsible for retrieving what the system displays to the user.

Let’s move to implementation.

The first thing to improve in the current state of Item PageObject would be to move element initialization outside.

This will be achieved with the following steps…

  1. Move elements to another class.
class ItemStructure {

    @Locator(xPath="//xpath to description")
    private Element description;
    @Locator(xPath="//xpath to addMore")
    private Button addMore;
    @Locator(xPath="//xpath to itemType")
    private Dropdown itemType;

    private WebDriver driver;

    ItemStructure(WebDriver driver) {
        this.driver = driver;
    }

    // Package-private so that Item, ItemCommand and ItemQuery can use them.
    Element getDescription() {
        return initElement(driver, ItemStructure.class, "description");
    }

    Button getAddMore() {
        return initElement(driver, ItemStructure.class, "addMore");
    }

    Dropdown getItemType() {
        return initElement(driver, ItemStructure.class, "itemType");
    }
}
  2. Inject structure into PageObject.
public class Item {

    private ItemStructure structure;

    public Item(WebDriver driver) {
       this.structure = new ItemStructure(driver);
    }

    public String getDescription() {
        return structure.getDescription().getText();
    }

    public void increaseItemQuantity() {
        structure.getAddMore().click();
    }

    public void selectItemType(String value) {
        structure.getItemType().select(value);
    }

}

Now PageObject looks neat.

So what’s the next step and why?

We could finish our refactoring here, but the class still mixes methods that retrieve data from the page, like getDescription(), with methods that interact with the system, like increaseItemQuantity().

  3. Treat PageObject as a Facade to real logic.
public class Item {
    private ItemCommand command;
    private ItemQuery query;

    private Item() {}
    private Item(ItemCommand command, ItemQuery query) {
        this.command = command;
        this.query = query;
    }

    public static Item getItem(WebDriver driver) {
        ItemStructure structure = new ItemStructure(driver);
        return new Item(new ItemCommand(structure), new ItemQuery(structure));
    }

    public ItemCommand commandTo() {
        return command;
    }

    public ItemQuery query() {
        return query;
    }
}
  4. Introduce Command and Query separation.
public class ItemCommand {

    private ItemStructure structure;

    ItemCommand(ItemStructure structure) {
        this.structure = structure;
    }

    public ItemCommand selectItemType(String value) {
        structure.getItemType().select(value);
        return this;
    }

    public ItemCommand increaseQuantity() {
        structure.getAddMore().click();
        return this;
    }

}

public class ItemQuery {

    private ItemStructure structure;

    ItemQuery(ItemStructure structure) {
        this.structure = structure;
    }

    public String itemDescription() {
        return structure.getDescription().getText();
    }
}
  5. So our client code becomes more readable.
    @Test
    void increasesItemQuantityAndSelectItemType() {
        Item item = Item.getItem(driver);

        item.commandTo()
            .increaseQuantity()
            .selectItemType("our item type goes here");

        //validations
    }

    @Test
    void retrievesDescription() {
        Item item = Item.getItem(driver);

        String itemDescription = 
            item.query()
                .itemDescription();
        //validations
    }

Okay, so why could this be beneficial?

Again, we have a clear separation of responsibilities: the command class is responsible for all interactions and user input actions, and the query class is responsible for getting and mapping the data from the UI. Such a separation is even more advantageous in more complex systems. When numeric values have to be mapped, the best place to do it is the query class. It also helps in data-driven testing when validating the input object against the actual one on the UI: validating every property on the UI one by one would be a mess, whereas here the actual UI object can be created in the query class.
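As an illustration of mapping in the query class, here is a minimal sketch. `ItemData`, `ItemTexts` and `ItemDataQuery` are hypothetical names, the quantity field is an assumption, and `ItemTexts` stands in for a structure class that hands out the raw texts shown on the page.

```java
import java.util.Objects;

// Hypothetical value object describing an item as the test expects it.
final class ItemData {
    final String description;
    final int quantity;

    ItemData(String description, int quantity) {
        this.description = description;
        this.quantity = quantity;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ItemData)) {
            return false;
        }
        ItemData other = (ItemData) o;
        return quantity == other.quantity && Objects.equals(description, other.description);
    }

    @Override
    public int hashCode() {
        return Objects.hash(description, quantity);
    }

    @Override
    public String toString() {
        return "ItemData{description='" + description + "', quantity=" + quantity + "}";
    }
}

// Stand-in for the structure class: it only hands out the raw texts shown on the page.
interface ItemTexts {
    String description();

    String quantity();
}

// Query object: maps raw UI text to typed data in one place.
final class ItemDataQuery {
    private final ItemTexts texts;

    ItemDataQuery(ItemTexts texts) {
        this.texts = texts;
    }

    int quantity() {
        // UI text may carry surrounding whitespace, so trim before parsing.
        return Integer.parseInt(texts.quantity().trim());
    }

    ItemData asItemData() {
        return new ItemData(texts.description().trim(), quantity());
    }
}
```

A data-driven test can then compare the expected `ItemData` with `query.asItemData()` in a single equality assertion instead of checking each property separately.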

Conclusion

You can use these two approaches together, or separately if one of them doesn’t meet your project requirements.

Both approaches require some programming knowledge as well as initial time to implement the core, so you have to decide for yourself whether they will benefit you and the team in the future.

In conclusion, choose wisely which path you take. Some simple systems may look innocent at first glance, but if they grow, the tests grow as well (or they should). If you don’t want to change your whole approach, or you simply can’t, consider less invasive measures, such as treating Page Objects more like Element/Section Objects rather than whole pages.

https://docs.telerik.com/kendo-ui/api/javascript/ui/combobox/methods/select

https://app.pluralsight.com/library/courses/automated-tests-java-fluent-interface-webdriver-selenium/table-of-contents

https://martinfowler.com/bliki/PageObject.html
