You identified a keyword worth targeting. Maybe a competitor is ranking for it and your app is not. Maybe your keyword tracking surfaced it as an opportunity with real demand and beatable competition.
The next step seems obvious: update your metadata and test it.
The problem is that most developers run the test wrong. They change three things in one update, check after a week, draw a conclusion, and change three more things. After four cycles of this, they have no idea what actually moved their rankings.
Here is how to run a keyword test that produces readable signal.
Why every metadata update resets the clock
When you submit a metadata update to the App Store, Apple re-indexes your app against the new signals. That indexing process takes time. At day 7, your rankings are in transition. At day 14, they are more stable. At day 21, you can trust the movement as representative of your new metadata rather than an artifact of reindexing.
This is not a soft guideline. It is the practical constraint of how Apple’s algorithm processes changes.
The implication: if you submit a metadata update on day 1 and another on day 10, the second update resets the clock before the first has finished indexing. You end up with two overlapping experiments and zero clean signal from either.
The developers with the clearest keyword data run one update every 21 to 42 days. Not faster. That constraint is what makes the data useful.
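The clock rule can be written down as a small date check. This is a minimal sketch, not anything Apple publishes: the 21-day window is this article's working figure, and the function names `earliest_next_update` and `overlaps_previous_test` are hypothetical helpers invented for illustration.

```python
from datetime import date, timedelta

# Assumption: 21 days is the settling window used in this article,
# not a number documented by Apple.
REINDEX_WINDOW_DAYS = 21


def earliest_next_update(last_update: date,
                         window_days: int = REINDEX_WINDOW_DAYS) -> date:
    """First date on which a new update no longer overlaps the previous test."""
    return last_update + timedelta(days=window_days)


def overlaps_previous_test(last_update: date, proposed: date,
                           window_days: int = REINDEX_WINDOW_DAYS) -> bool:
    """True if the proposed update would land inside the previous test's window."""
    return proposed < earliest_next_update(last_update, window_days)


# The day 1 / day 10 example from the text: the second update overlaps the first.
first = date(2024, 1, 1)
second = date(2024, 1, 10)
print(overlaps_previous_test(first, second))
print(earliest_next_update(first))
```

Running a check like this before every submission is a cheap way to keep two experiments from overlapping.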
The three fields, from least to most disruptive
Not all metadata changes carry equal risk to existing rankings.
Keyword field (100 characters). This is the lowest-risk place to test a new keyword. It is not visible to users, so changes here do not affect click-through rate or conversion. It is indexed primarily for search ranking. When testing a keyword you have not confirmed before, start here. If the keyword does not move your ranking after 21 days, you have learned something without touching anything users see.
Subtitle (30 characters). The subtitle appears in search results and on your product page. A change here affects both keyword indexing and the copy that a user reads before tapping in. Test subtitle changes after you have validated a keyword in the keyword field, or when you have enough confidence to commit the keyword to a visible position.
Title (30 characters). The title carries the most ranking weight and is the first thing a user sees in search results and on the store listing. Title changes produce the largest ranking swings in both directions. Change the title only after you have confirmed a keyword works in the keyword field or the subtitle, and only when you are ready for the full 21-day re-indexing window.
The sequence: keyword field first, subtitle second, title last. You are moving from validation to commitment.
Running a clean test
A clean test changes one variable in one update cycle.
In practice:
- Identify the keyword you want to test.
- Remove the lowest-performing keyword from your current keyword field.
- Add the new keyword in its place.
- Submit the update. Record the date.
- Wait 21 days.
- Check the rank position for your new keyword against your baseline.
That is the complete experiment. One keyword in, one keyword out, one 21-day window.
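The swap itself can be sketched in a few lines. This example assumes the keyword field is stored as a comma-separated string and enforces the 100-character limit mentioned above. The function name `swap_keyword` and the sample keywords are hypothetical, chosen only for illustration.

```python
# Character limit this article cites for the keyword field.
KEYWORD_FIELD_LIMIT = 100


def swap_keyword(field: str, remove: str, add: str,
                 limit: int = KEYWORD_FIELD_LIMIT) -> str:
    """Replace exactly one keyword in a comma-separated keyword field.

    Raises ValueError if the swap would not be a clean one-in, one-out
    change or if the result exceeds the character limit.
    """
    keywords = [k.strip() for k in field.split(",") if k.strip()]
    if remove not in keywords:
        raise ValueError(f"{remove!r} is not in the current keyword field")
    if add in keywords:
        raise ValueError(f"{add!r} is already in the keyword field")
    # Keep the new keyword in the same slot as the one it replaces.
    updated = [add if k == remove else k for k in keywords]
    new_field = ",".join(updated)
    if len(new_field) > limit:
        raise ValueError(
            f"new field is {len(new_field)} characters; limit is {limit}")
    return new_field


# One keyword out, one keyword in (sample data, not real recommendations).
print(swap_keyword("budget,expense,tracker", "expense", "receipts"))
```

Rejecting anything other than a one-for-one swap keeps the update a single-variable change by construction.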
Marteso’s version history logs each metadata update automatically. On day 21 you can pull up the version, see exactly what changed, and compare keyword rankings before and after without rebuilding context from notes or memory.
Reading the signal at day 21
At day 21, you are looking for one of three outcomes.
The keyword moved positively. It entered ranking or improved position. Keep it for another cycle and consider whether reinforcing it in the subtitle would compound the gain.
The keyword did not move. It stayed flat or did not enter ranking at all. This is data, not a failed test. The keyword may need title-level placement to index at sufficient weight, or it may require more relevance signals from ratings and engagement. Note the result and rotate a different candidate into the slot.
The keyword moved, but other keywords dropped. You have a trade-off. The new term may be competing with existing keywords for indexing weight in the same semantic cluster. Decide which ranking matters more strategically before your next update.
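If you log ranks in a spreadsheet or script, the three outcomes reduce to a short decision function. This is a sketch under stated assumptions: a rank of `None` means the app is not ranked for the keyword, a lower number is a better position, and the function name and labels are hypothetical. A rank that got worse is not one of the three outcomes above, so this sketch groups it with "did not move".

```python
from typing import Optional


def classify_outcome(baseline_rank: Optional[int],
                     day21_rank: Optional[int],
                     other_keywords_dropped: bool = False) -> str:
    """Map a day-21 reading onto the three outcomes described in the text.

    None means unranked; a lower number is a better position.
    Returns "positive", "no_gain", or "trade_off".
    """
    # Improved = entered ranking, or moved to a better (smaller) position.
    improved = day21_rank is not None and (
        baseline_rank is None or day21_rank < baseline_rank)
    if not improved:
        # Flat, still unranked, or (assumption beyond the article) declined.
        return "no_gain"
    if other_keywords_dropped:
        return "trade_off"
    return "positive"


print(classify_outcome(None, 45))                              # entered ranking
print(classify_outcome(60, 60))                                # stayed flat
print(classify_outcome(60, 30, other_keywords_dropped=True))   # gain with a cost
```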
The one mistake that makes tests unreadable
Most bad keyword tests are not bad because the keyword was wrong. They are bad because something else changed in the same window.
App updates, rating prompt additions, or screenshot changes submitted alongside a keyword update all introduce variables. The ranking shift you observe at day 21 could be from the keyword, the screenshots, the rating prompts, or some combination. You cannot know which.
Isolate keyword tests from other changes when possible. If you need to submit a bug fix during an active keyword test, note it in your version history. If you need to change screenshots and keywords at the same time, accept that the test will be less clean, and wait for a longer window before drawing conclusions.
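One way to keep yourself honest is to check your change log for anything else that shipped inside the test window. The sketch below assumes you keep a simple list of dated notes. The structure of that log and the function name `confounders_in_window` are hypothetical, not a Marteso or App Store Connect API.

```python
from datetime import date, timedelta


def confounders_in_window(test_start: date,
                          change_log: list[tuple[date, str]],
                          window_days: int = 21) -> list[str]:
    """Return labels of non-keyword changes that landed during the test window.

    change_log holds (date, label) pairs for everything other than the
    keyword swap itself, for example screenshots or rating prompts.
    """
    window_end = test_start + timedelta(days=window_days)
    return [label for when, label in change_log
            if test_start <= when <= window_end]


# Sample log (illustrative data only).
log = [
    (date(2024, 3, 1), "screenshots"),
    (date(2024, 3, 10), "bug fix"),
    (date(2024, 4, 15), "rating prompt"),
]
print(confounders_in_window(date(2024, 3, 1), log))
```

An empty result means the keyword swap was the only logged change in the window. Anything else in the list is a reason to read the day-21 numbers with more caution.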
The compounding effect
The developers with the strongest keyword coverage after 12 months are not running annual ASO audits. They are running consistent, single-variable cycles.
After eight clean 21-day tests, you have trialed eight replacement candidates against your weakest keyword slots and kept the ones that earned their place. That is a keyword field built from evidence. Each cycle makes the next one faster because you have a clear record of what worked, what did not, and why.
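The arithmetic behind that cadence is worth making explicit. This sketch uses only the 21-day and 42-day cycle lengths from this article, and the helper names are hypothetical.

```python
def days_for_tests(num_tests: int, cycle_days: int = 21) -> int:
    """Calendar days needed to run num_tests back-to-back cycles."""
    return num_tests * cycle_days


def cycles_per_year(cycle_days: int, days_in_year: int = 365) -> int:
    """Whole test cycles that fit in one year at a given cycle length."""
    return days_in_year // cycle_days


print(days_for_tests(8))      # eight 21-day tests
print(cycles_per_year(21))    # fastest cadence discussed in the article
print(cycles_per_year(42))    # slowest cadence discussed in the article
```

Eight tests take 168 days, and a year holds between 8 and 17 full cycles depending on where you sit in the 21 to 42 day range.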
Marteso’s keyword tracking shows 7-day and 30-day rank trend lines per keyword alongside your full metadata update history. The weekly habit is simple: confirm that your current metadata is still working, identify any signal worth acting on, and plan the next single-variable test. Most weeks, nothing changes. When a real signal appears, you have the structure to act on it cleanly.
The one thing to do today
Open your keyword field. Find the lowest-performing keyword in your current 100 characters. Identify one replacement candidate from your suggestions or competitor analysis.
Plan a single update: swap that one keyword, submit it, log the date, and check back in 21 days.
That is the complete test. Run it once and see what you learn. Then run it again.